The largest eigenvalues of sample covariance 
matrices for a spiked population: 
diagonal case. 

Delphine Féral and Sandrine Péché
December 12, 2008 

Abstract 

We consider large complex random sample covariance matrices ob- 
tained from "spiked populations", that is when the true covariance ma- 
trix is diagonal with all but finitely many eigenvalues equal to one. We 
investigate the limiting behavior of the largest eigenvalues when the pop- 
ulation and the sample sizes both become large. Under some conditions 
on moments of the sample distribution, we prove that the asymptotic fluc- 
tuations of the largest eigenvalues are the same as for a complex Gaussian 
sample with the same true covariance. The real setting is also considered. 

1 Introduction and results 

Sample covariance matrices are fundamental to multivariate statistics. Their 
spectral properties are e.g. important for Principal Component Analysis. In 
the case where the population size remains "small" while the sample size be- 
comes sufficiently large, these spectral properties are well-understood. It is a 
classical probability result that the sample covariance matrix is a good approximation
of the population covariance. Nowadays it is of strong interest to study
the case where both the sample and population sizes become large, due to the 
large amount of data available. In this setting, the study of asymptotic spectral 
properties of sample covariance matrices has many applications. The behavior 
of Principal Component Analysis has first to be understood. We refer the reader 
to [13] and [14] for a review of other statistical applications. Other examples of
applications include genetics [19] , mathematical finance [1] , [7] , [8] , [15] , wireless 
communication [55] , physics of mixture [23] and statistical learning [T^] . 
In this paper, we investigate the limiting distribution of the largest eigenvalues 
of sample covariance matrices for some so-called "spiked population models".
Such models have been introduced for a Gaussian sample in [14] and correspond 
to the case where the true covariance is a small rank perturbation of the Identity 
matrix. In this paper, both the impact of the largest eigenvalues of the true 
covariance and that of the distribution of the sample on the asymptotic behav- 
ior of the largest eigenvalues are investigated. The fluctuations of the largest 
eigenvalues of some not necessarily Gaussian samples are compared to those of
a Gaussian sample with the same true covariance. These questions are mainly 
motivated by statistical applications. Indeed some statistical tests are based 
on the conjecture that the behavior of the largest eigenvalues of spiked sample 
covariance matrices is the same as for a Gaussian sample provided the sample 
distribution is close to a Gaussian distribution (see e.g. [19]).

1.1 Model and results 

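Let $X = X_N$ be an $N \times p$ complex (resp. real) random matrix such that
$\{\Re X_{ij}, \Im X_{ij};\ 1 \le i \le N,\ 1 \le j \le p\}$ (resp. $\{X_{ij};\ 1 \le i \le N,\ 1 \le j \le p\}$) are
real independent random variables satisfying, for all $1 \le i \le N$ and $1 \le j \le p$:

(H1) $\mathbb{E}X_{ij} = 0$ and $\mathbb{E}(\Re X_{ij})^2 = \mathbb{E}(\Im X_{ij})^2 = \sigma^2/2$ (resp. $\mathbb{E}X_{ij}^2 = \sigma^2$);

(H2) there exists a constant $C_0 > 0$ independent of $N, p$ (and $(i,j)$) such that
$\forall k > 0$, $\mathbb{E}|X_{ij}|^{2k} \le (C_0 k)^k$;

(H3) $\mathbb{E}\big((\Re X_{ij})^{2k+1}\big) = \mathbb{E}\big((\Im X_{ij})^{2k+1}\big) = 0$ (resp. $\mathbb{E}\big(X_{ij}^{2k+1}\big) = 0$) $\forall k \ge 0$.

To avoid technicalities, we assume throughout the paper that $p \ge N$. Here the
size of the matrix $X$ goes to infinity in such a way that, if we set $\gamma_N = p/N$,

$$\exists\, \gamma \ge 1 \text{ such that } \gamma_N \to \gamma \text{ as } N \to \infty. \qquad (1)$$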

Let $r$ be a given integer independent of $N$ and $p$. Let also $\pi_1 \ge \pi_2 \ge \dots \ge \pi_r > 1$
be given real numbers, all of which are independent of $N$ and $p$. The covariance
matrix $\Sigma = \Sigma_N$ is the $N \times N$ diagonal matrix

$$\Sigma = \mathrm{diag}(\pi_1, \pi_2, \dots, \pi_r, 1, \dots, 1). \qquad (2)$$

The goal of this paper is to describe the large-$N$ limiting distribution of the
largest eigenvalues of the spiked model defined by

$$V_N := \frac{1}{p}\,\Sigma^{1/2} X X^* \Sigma^{1/2}. \qquad (3)$$

Note that the spectral properties of the associated matrix $V'_N = \frac{1}{p} X^* \Sigma X$ can
be deduced from those of $V_N$ since their non-zero eigenvalues are equal.
Throughout this paper, the white (or null) model corresponds to $\Sigma = \mathrm{Id}$ (or
$r = 0$) and is called

$$M_N := \frac{1}{p}\, X X^*. \qquad (4)$$
When the entries of $X$ are further assumed to be Gaussian random variables,
we write $M_N^G$ (resp. $V_N^G$) instead of $M_N$ (resp. $V_N$). $M_N^G$ is then a matrix from
the so-called Laguerre unitary (resp. orthogonal) ensemble (LUE (resp. LOE))
of parameter $\sigma$, also known as the complex (resp. real) Wishart ensembles.

First, let us consider the global behavior of the spectrum. Let $\lambda_1(V_N) \ge
\lambda_2(V_N) \ge \dots \ge \lambda_N(V_N)$ be the ordered eigenvalues of $V_N$ and let $\mu_N$ be the
spectral measure defined by $\mu_N = \frac{1}{N}\sum_{i=1}^{N} \delta_{\lambda_i(V_N)}$. Setting

$$u_\pm = \sigma^2 \big(1 \pm \gamma^{-1/2}\big)^2, \qquad (5)$$

it is a well-known result of [16] that, for any matrix $\Sigma$ given by (2) (including $\Sigma =
\mathrm{Id}$), $\mu_N$ a.s. converges as $N \to \infty$ to the Marchenko-Pastur distribution $\mu_{MP}$
whose density is

$$\frac{d\mu_{MP}(x)}{dx} = \frac{\gamma}{2\pi x \sigma^2}\sqrt{(u_+ - x)(x - u_-)}\;1_{[u_-,u_+]}(x).$$

The global behavior of the spectrum is thus not impacted by the spiked structure of $\Sigma$.
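
As a numerical sanity check of (5) and of the Marchenko-Pastur density, the following Python sketch (standard library only) evaluates the edges $u_\pm$ and integrates the density with a midpoint rule. The function names `mp_edges`, `mp_density` and `mp_moments` are illustrative choices and not part of the paper; the check only makes sense for $\gamma > 1$, where the density is bounded.

```python
import math


def mp_edges(gamma, sigma2=1.0):
    """Edges (u_-, u_+) of the Marchenko-Pastur support, as in (5)."""
    root = gamma ** -0.5
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2


def mp_density(x, gamma, sigma2=1.0):
    """Marchenko-Pastur density at x for gamma = lim p/N > 1."""
    u_minus, u_plus = mp_edges(gamma, sigma2)
    if x <= u_minus or x >= u_plus:
        return 0.0
    return gamma * math.sqrt((u_plus - x) * (x - u_minus)) / (2.0 * math.pi * x * sigma2)


def mp_moments(gamma, sigma2=1.0, steps=100_000):
    """Total mass and mean of the density, computed by the midpoint rule."""
    u_minus, u_plus = mp_edges(gamma, sigma2)
    h = (u_plus - u_minus) / steps
    mass = 0.0
    mean = 0.0
    for i in range(steps):
        x = u_minus + (i + 0.5) * h
        f = mp_density(x, gamma, sigma2)
        mass += f * h
        mean += x * f * h
    return mass, mean
```

Calling `mp_moments(2.0, 1.5)` should return a mass close to 1 and a mean close to $\sigma^2 = 1.5$, which is a quick way to confirm the normalisation constant $\gamma/(2\pi x\sigma^2)$.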

The situation is drastically different for the largest eigenvalues. Let us first
recall the well-known asymptotic behavior of the largest eigenvalues of $M_N$.
This asymptotic behavior has been identified for the complex or real Wishart
ensembles in [13] and [14] and later extended to a much wider class of
white sample covariance matrices $M_N$ in [20]. To be more precise, we need the
following definitions. We denote by $Ai$ the standard Airy function. Define the
Airy kernel by $A(u,v) = \frac{Ai(u)Ai'(v) - Ai'(u)Ai(v)}{u - v}$ and let $A_x$ be the operator acting
on $L^2((x,+\infty))$ with kernel $A(u,v)$. Let $F_{GU(O)E}$ be the GU(O)E Tracy-Widom
distribution defined in [29], which is the limiting distribution of the largest
eigenvalue of the Gaussian unitary (resp. orthogonal) ensemble (GU(O)E) as
the size tends to infinity. It can in particular be shown that $F_{GUE}$ is given by
the Fredholm determinant $F_{GUE}(x) = \det(1 - A_x)$. More generally, given an
integer $K \ge 1$, we denote by $F^K_{GU(O)E}$ the limiting joint distribution of the $K$
largest eigenvalues of the GU(O)E (the precise definitions are given in [29] and
[30]). Last we define

$$\rho_N = \sigma^2 \big(1 + \gamma_N^{-1/2}\big)^2 \quad\text{and}\quad \sigma_N = \sigma^2\, \gamma_N^{-1/2} \big(1 + \gamma_N^{-1/2}\big)^{4/3}.$$

The next theorem has been proved in the more general case where $\gamma \in [0, \infty]$.

Theorem 1.1. [13], [14], [20] Let $K \ge 1$ be an integer. Let $M_N$ be a complex
(or real) random matrix given by (4) and assume that $X$ satisfies (H1)-(H3).
Then, for any $(x_1, \dots, x_K) \in \mathbb{R}^K$, one has that

$$\lim_{N\to\infty} \mathbb{P}\Big(\frac{N^{2/3}}{\sigma_N}\big(\lambda_i(M_N) - \rho_N\big) \le x_i,\ \forall i = 1, \dots, K\Big) = F^K_{GU(O)E}(x_1, x_2, \dots, x_K).$$
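
The centering and scaling used in Theorem 1.1 can be sketched in a few lines of Python. This is only an illustration under the assumption that $\rho_N = \sigma^2(1+\gamma_N^{-1/2})^2$ and $\sigma_N = \sigma^2\gamma_N^{-1/2}(1+\gamma_N^{-1/2})^{4/3}$, the form that agrees with the classical Wishart edge scaling; the helper names are hypothetical.

```python
def centering_and_scaling(N, p, sigma2=1.0):
    """Return (rho_N, sigma_N) for an N x p sample, with gamma_N = p / N.

    Assumed formulas (see the lead-in): rho_N is the right edge of the
    Marchenko-Pastur support at ratio gamma_N, sigma_N the edge scale.
    """
    gamma_n = p / N
    root = gamma_n ** -0.5
    rho = sigma2 * (1.0 + root) ** 2
    sigma_n = sigma2 * root * (1.0 + root) ** (4.0 / 3.0)
    return rho, sigma_n


def rescale_largest_eigenvalue(eigenvalue, N, p, sigma2=1.0):
    """Map an eigenvalue of M_N to the Tracy-Widom scale N^{2/3}(lambda - rho_N)/sigma_N."""
    rho, sigma_n = centering_and_scaling(N, p, sigma2)
    return N ** (2.0 / 3.0) * (eigenvalue - rho) / sigma_n
```

For a square sample ($p = N$, $\sigma^2 = 1$) this gives $\rho_N = 4$ and $\sigma_N = 2^{4/3}$.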

Theorem 1.1 implies that a.s. $\lim_{N\to\infty} \lambda_1(M_N) = u_+$ (this result is proved
in a more general setting in [31]). The situation may be quite different if $\Sigma$ is
chosen as in (2). The first results in this direction have been obtained in [5] for
the complex Gaussian spiked population model $V_N^G$. Therein the authors point
out a phase transition phenomenon for the fluctuations of the largest eigenvalue
according to the value of the largest eigenvalue(s) of the covariance matrix $\Sigma$.
To state the result, further definitions are needed. For all $m \ge 1$, let $s^{(m)}$ and
$t^{(m)}$ denote the functions defined in [5], which are integrals or derivatives of the
standard Airy function. For any $k \ge 1$, define the distribution function $F_k$ (see [4]) by

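$$F_k(x) = \det(1 - A_x)\,\det\Big(\delta_{m,n} - \Big\langle \frac{1}{1 - A_x}\, s^{(m)},\, t^{(n)} \Big\rangle\Big)_{1 \le m,n \le k}, \qquad (6)$$

for any real $x$, where $\langle \cdot , \cdot \rangle$ denotes the (real) inner product of functions in
$L^2((x,\infty))$. Define also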

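$$w_c = 1 + \frac{1}{\sqrt{\gamma}}, \qquad (7)$$

$$\tau(\pi_1) = \sigma^2 \pi_1 \Big(1 + \frac{\gamma^{-1}}{\pi_1 - 1}\Big). \qquad (8)$$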



Note that if $\pi_1 > w_c$ (resp. $\pi_1 = w_c$), then $\tau(\pi_1) > u_+$ (resp.
$\tau(\pi_1) = u_+$).
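
The threshold (7) and the limit $\tau(\pi_1)$ are easy to evaluate numerically. The sketch below assumes the form $\tau(\pi_1) = \sigma^2\pi_1\big(1 + \gamma^{-1}/(\pi_1 - 1)\big)$, which matches $u_+$ at $\pi_1 = w_c$; the function names and the `regime` labels are illustrative only.

```python
def critical_threshold(gamma):
    """w_c = 1 + 1/sqrt(gamma), as in (7)."""
    return 1.0 + gamma ** -0.5


def upper_edge(gamma, sigma2=1.0):
    """Right edge u_+ of the Marchenko-Pastur support."""
    return sigma2 * (1.0 + gamma ** -0.5) ** 2


def tau(pi1, gamma, sigma2=1.0):
    """Assumed limit of the largest eigenvalue when pi1 > w_c (requires pi1 > 1)."""
    return sigma2 * pi1 * (1.0 + 1.0 / (gamma * (pi1 - 1.0)))


def regime(pi1, gamma, tol=1e-12):
    """Classify pi1 with respect to the critical value w_c."""
    w_c = critical_threshold(gamma)
    if pi1 > w_c + tol:
        return "supercritical"
    if pi1 < w_c - tol:
        return "subcritical"
    return "critical"
```

With $\gamma = 4$ one gets $w_c = 1.5$, and `tau(1.5, 4.0)` coincides with `upper_edge(4.0)`, illustrating the continuity of the limit at the phase transition.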



We here give only the asymptotic behavior of the largest eigenvalue of the
complex non-white Wishart ensemble (see Remark 1.1 for some extensions).

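Theorem 1.2. [5] Consider the sequence of complex Wishart matrices $(V_N^G)$
when $\Sigma$ is given by (2). Let $1 \le k \le r$ be an integer. For any real $x$, one has
that

(i) If $\pi_1 = \dots = \pi_k > w_c$ and $\pi_{k+1} < \pi_1$ then

$$\lim_{N\to\infty} \mathbb{P}\Big(\frac{N^{1/2}}{\sigma(\pi_1)}\big(\lambda_1(V_N^G) - \tau(\pi_1)\big) \le x\Big) = G_k(x),$$

where $G_k$ is the distribution of the largest eigenvalue of the un-normalized
GUE random matrix $H = (H_{ij})_{i,j=1}^{k}$ with i.i.d. complex standard Gaussian
entries above the diagonal.

(ii) If $\pi_1 = \dots = \pi_k = w_c$ and $\pi_{k+1} < w_c$ then

$$\lim_{N\to\infty} \mathbb{P}\Big(\frac{N^{2/3}}{\sigma_N}\big(\lambda_1(V_N^G) - \rho_N\big) \le x\Big) = F_k(x).$$

(iii) If $\pi_1 < w_c$ then $\lim_{N\to\infty} \mathbb{P}\Big(\frac{N^{2/3}}{\sigma_N}\big(\lambda_1(V_N^G) - \rho_N\big) \le x\Big) = F_{GUE}(x)$.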
Remark 1.1. The joint distribution of the $K$ largest eigenvalues ($K \le k$ in (i))
can easily be deduced by a straightforward extension of the arguments of [5] in
cases (i) and (iii). In case (i), the joint distribution of the $K$ (correctly rescaled)
largest eigenvalues $\lambda_i(V_N^G)$, $1 \le i \le K$, converges to the law of the $K$ largest
eigenvalues of $H$ (see also [21]). In case (iii), the full conclusion of Theorem 1.1
holds true.

Some extensions of Theorem 1.2 have been obtained. First, in [17], the
counterpart of Theorem 1.2 has been established for singular Wishart matrices.
Therein, the case where $p < N$ and $\lim_{N\to\infty} \gamma_N = \gamma \in [0,1]$ is investigated.
The same phase transition phenomenon is established. In [18], real Wishart
ensembles are considered. Unlike complex (singular or not) Wishart ensembles,
the joint eigenvalue density is not known in the real setting. Using perturbation
theory, it is proved that when $\pi_1 > w_c$ is simple, the largest eigenvalue of real
non-white Wishart matrices exhibits Gaussian fluctuations (with a different
variance). Some more recent extensions have also been obtained in [3] and are
recalled below.

In view of Theorem 1.1, it is natural to investigate the question whether
Theorem 1.2 (and its real analogue) would actually hold true for non-Gaussian
samples. Our main results, stated below in Theorem 1.5 and Theorem 1.6,
answer this universality question. Before that, a partial answer has been given
at the level of a.s. convergence by [6]. We partially state here their result.

Theorem 1.3. [6] Let $V_N$ be a complex or real sample covariance matrix defined
by (3) with $\Sigma$ given by (2). Assume that the entries of $X$ are i.i.d. with $\mathbb{E}X_{11} =
0$, $\mathbb{E}|X_{11}|^2 = \sigma^2$ and $\mathbb{E}|X_{11}|^4 < \infty$. Let $1 \le k \le r$ be an integer. Then,

(i) If $\pi_1 = \dots = \pi_k > w_c$ and $\pi_{k+1} < \pi_1$ then $\lambda_1(V_N), \dots, \lambda_k(V_N)$ a.s.
converge to $\tau(\pi_1)$.

(ii) If $\pi_1 \le w_c$ then $\lambda_1(V_N)$ a.s. converges to $u_+$.

Regarding the fluctuations of the largest eigenvalues now, [3] determines
their asymptotic distribution in the case where these eigenvalues are well separated
from the bulk. They consider both the complex and real models in the
case where the entries of $X$ are i.i.d. with finite fourth moment. They prove
that (i) in Theorem 1.2 holds true for a wide class of non-Gaussian samples $X$
only if the rescaling factor $\sigma(\pi_1)$ given in (8) is modified to include the "excess
kurtosis" of the entries of $X$. The excess kurtosis is given by $\mathbb{E}|X_{11}|^4/\sigma^4 - 2$ (resp.
$\mathbb{E}X_{11}^4/\sigma^4 - 3$) in the complex (resp. real) case and is zero for Gaussian distributions.
One may also indicate that their result is stated in a more general setting
than that considered here: in particular, the true covariance does not need to be
diagonal. To ease the exposition, we give their result with the added condition
(9) which requires the fourth moment of $X_{11}$ to be as in the Gaussian case.
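
To make the fourth-moment condition concrete, the following Python sketch computes the excess kurtosis of a finite real distribution, taken here to be $\mathbb{E}X^4/(\mathbb{E}X^2)^2 - 3$. The two example laws are hypothetical illustrations, not taken from the paper: a Rademacher variable (excess kurtosis $-2$) and a symmetric three-point law whose fourth moment matches the Gaussian one.

```python
def moment(dist, order):
    """E[X**order] for a finite real law given as (value, probability) pairs."""
    return sum(prob * value ** order for value, prob in dist)


def excess_kurtosis_real(dist):
    """Excess kurtosis E[X^4] / (E[X^2])^2 - 3 of a centered real law."""
    variance = moment(dist, 2)
    return moment(dist, 4) / variance ** 2 - 3.0


SIGMA = 1.0
ROOT3 = 3.0 ** 0.5

# Symmetric +/- sigma law: variance sigma^2, fourth moment sigma^4.
rademacher = [(-SIGMA, 0.5), (SIGMA, 0.5)]

# Symmetric three-point law: variance sigma^2, fourth moment 3 sigma^4.
three_point = [(-ROOT3 * SIGMA, 1.0 / 6.0), (0.0, 2.0 / 3.0), (ROOT3 * SIGMA, 1.0 / 6.0)]
```

Both laws are symmetric, so all odd moments vanish; only the three-point law has zero excess kurtosis and therefore the Gaussian fourth moment $3\sigma^4$.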

Theorem 1.4. Assume that the assumptions of Theorem 1.3 are satisfied
with

$$\mathbb{E}|X_{11}|^4 = (1 + \beta')\sigma^4, \qquad (9)$$

where $\beta' = 1$ (resp. $\beta' = 2$) in the complex (resp. real) case. Define

$$\xi_i(V_N) := \frac{N^{1/2}}{\sigma(\pi_1)}\big(\lambda_i(V_N) - \tau(\pi_1)\big).$$

If $\pi_1 = \dots = \pi_k > w_c$ and $\pi_{k+1} < w_c$ then the $N$-limiting distribution of
$(\xi_1(V_N), \dots, \xi_k(V_N))$ is the joint distribution of the $k$ eigenvalues, ordered in
decreasing order, of the GUE (resp. GOE) $H = (H_{ij})_{i,j=1}^{k}$ with i.i.d. complex
(resp. real) standard Gaussian entries above the diagonal.

The approach of [3], following ideas of [18], is mainly based on the fact
(contained in Theorem 1.3) that the largest eigenvalues split from the bulk. The
proof relies on some perturbation theory ideas which allow one to see the rescaled
largest eigenvalues $\xi_i(V_N)$ as the eigenvalues of a $k \times k$ random matrix defined
in terms of the resolvent of an underlying white matrix. The conclusion then
essentially follows from a CLT on random sesquilinear forms, explaining the
assumption on the first four moments of the entries $X_{ij}$.

The question of the universality of the two other regimes in Theorem 1.2
remains open. This is the gap we here fill in and is the main result of this note.

Theorem 1.5. Consider the sequence of complex sample covariance matrices
$(V_N)$ defined by (3) where the entries of $X$ satisfy (H1)-(H3) and $\Sigma$ is given
by (2). Let $1 \le k \le r$ be an integer.
When $\pi_1 \ge w_c$, assuming furthermore that

(H4) $\mathbb{E}(\Re X_{ij})^4 = \mathbb{E}(\Im X_{ij})^4 = 3\sigma^4/4$, $\forall\, 1 \le i \le N$, $\forall\, 1 \le j \le p$,

the conclusions of Theorem 1.2 hold true for $V_N$.

When $\pi_1 < w_c$, the conclusion of Theorem 1.1 is true for $V_N$.

In the real setting, we prove the universality in the two non-critical regimes. 

Theorem 1.6. Consider the sequence of real sample covariance matrices $(V_N)$
defined by (3) where the entries of $X$ satisfy (H1)-(H3) and $\Sigma$ is given by (2).
Let $1 \le k \le r$ be an integer.
When $\pi_1 > w_c$, assuming furthermore that

(H'4) $\mathbb{E}(X_{ij}^4) = 3\sigma^4$, $\forall\, 1 \le i \le N$, $\forall\, 1 \le j \le p$,

the conclusion of Theorem 1.4 holds true for $V_N$.

When $\pi_1 < w_c$, the conclusion of Theorem 1.1 is true for $V_N$.

As we will explain, the proof of these theorems is based on a combinatorial
method combined with some results on the corresponding Gaussian model. In
fact, our combinatorial arguments below also cover the real setting of Theorem
1.6 in the critical case where $\pi_1 = w_c$. Thus, our analysis reduces the
universality problem in this case (under the assumption (H4)) to the knowledge
of the asymptotic fluctuations of the largest eigenvalues of the associated real
Gaussian model. Unfortunately the latter result is not known so far.

1.2 Core of the proof

We here give the main ideas of the proof of Theorems 1.5 and 1.6. We first
mainly concentrate on the complex setting. At the end of this section, we discuss
the main modifications needed to consider the real case.
The proof follows essentially the strategy introduced in [20] (see also [26] and
[27]) and we refer to this paper for most of the detail. We also refer the reader
to [11] where the authors investigate Deformed Wigner matrices, which are classical
Wigner matrices decentered by a particular deterministic matrix. The Deformed
Wigner model can be seen as the additive analogue of the present model.
In particular, it exhibits a similar phase transition phenomenon regarding the
asymptotic behavior of the largest eigenvalues. [11] establishes the universality
of the fluctuations of the largest eigenvalues for not necessarily Gaussian Deformed
Wigner matrices. The approach developed here is close to that of [11]
and is mainly based on combinatorial arguments.

Basically, and for each of the three regimes depending on the value of $\pi_1$,
we compute the leading term in the asymptotic expansion of expectations (and
also higher moments) of traces of high powers of $V_N$, that is $\mathbb{E}(\mathrm{Tr}\,V_N^{s_N})$, where
$\mathrm{Tr}$ denotes the classical (un-normalized) trace. We consider specific exponents
$s_N$ which depend on the scaling of the fluctuations of the largest eigenvalue(s)
when the size $N$ goes to $\infty$. The core of the proof is to show the universality of
moments (of any fixed order) of traces of powers of $V_N$ in these scales. Let us
explain this more precisely in the particular case of the expectation.
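
The reason why traces of high powers capture the largest eigenvalue can be seen on a toy spectrum. The Python sketch below is purely illustrative: the spectrum is hypothetical (one outlier at $\tau(1 + x/\sqrt{N})$ and all other eigenvalues crudely placed at $0.9\,\tau$), and the names `normalized_trace_power` and `toy_trace` are not from the paper. With a power $s_N = c\sqrt{N}$, the normalized trace is close to $e^{cx}$, so it records the fluctuation $x$ of the outlier while the bulk contributes a negligible amount.

```python
import math


def normalized_trace_power(eigenvalues, center, power):
    """Tr (V / center)^power computed from the eigenvalues of V."""
    return sum((lam / center) ** power for lam in eigenvalues)


def toy_trace(N=10_000, c=2.0, x=0.5, tau=3.0):
    """Normalized trace for a hypothetical spectrum with a single outlier."""
    s_n = int(c * math.sqrt(N))              # power of order N^{1/2}
    outlier = tau * (1.0 + x / math.sqrt(N)) # fluctuation of order N^{-1/2}
    bulk = [0.9 * tau] * (N - 1)             # crude stand-in for the bulk
    return normalized_trace_power([outlier] + bulk, tau, s_n)
```

For the default parameters, `toy_trace()` differs from $e^{cx} = e$ by less than $0.02$; the bulk term is of order $N\,(0.9)^{s_N}$ and is numerically invisible.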

In the case where $\pi_1 > w_c$, it is expected that the largest eigenvalue(s)
exhibits fluctuations in the scale $N^{-1/2}$ around $\tau(\pi_1)$. We thus consider an
arbitrary sequence of integers $(s_N)$ such that $\lim_N s_N/N^{1/2} = c$ for some constant
$c > 0$. We first show that $\mathbb{E}(\mathrm{Tr}(V_N/\tau(\pi_1))^{s_N})$ is bounded. Then, we prove
that the leading term in the asymptotic expansion of $\mathbb{E}(\mathrm{Tr}\,V_N^{s_N})$ depends on $\pi_1$,
on the variance $\sigma^2$ and on the fourth moment of the $X_{ij}$'s only. Assume now
that (H4) is satisfied, i.e. the fourth moment of the $X_{ij}$'s is taken to be that of
the Gaussian distribution with variance $\sigma^2$. Then, up to a negligible error, the
expectation $\mathbb{E}(\mathrm{Tr}\,V_N^{s_N})$ does not depend asymptotically on the particular law of
the entries and one has that

$$\mathbb{E}(\mathrm{Tr}\,V_N^{s_N}) = \mathbb{E}[\mathrm{Tr}(V_N^G)^{s_N}](1 + o(1)). \qquad (10)$$

In the critical case where $\pi_1 = w_c$, the largest eigenvalue fluctuates now in
the scale $N^{-2/3}$ around the right edge $u_+$ of the Marchenko-Pastur support.
The powers $s_N$ to be considered are such that $\lim_N s_N/N^{2/3} = c$ for some
constant $c > 0$ and we first show that $\mathbb{E}(\mathrm{Tr}(V_N/u_+)^{s_N})$ is bounded. We then
prove that the leading term in the asymptotic expansion of $\mathbb{E}(\mathrm{Tr}\,V_N^{s_N})$ depends
on $\sigma^2$ and on the fourth moment of the entries of $X$ only. So assuming again
that (H4) holds true, we show that the expectation behaves in the large-$N$ limit
as in the Gaussian case and that (10) still holds true.

In the sub-critical case $\pi_1 < w_c$, we still consider powers $s_N$ of the order
of $N^{2/3}$. Here, the fluctuations of the largest eigenvalues are expected not to
depend on $\pi_1$ and to be exactly as in the white case where $\Sigma = \mathrm{Id}$. We get this
by showing that $\mathbb{E}(\mathrm{Tr}\,V_N^{s_N}) = \mathbb{E}[\mathrm{Tr}\,M_N^{s_N}](1 + o(1))$. Thanks to the investigations
of [20] on the universality of the fluctuations of the largest eigenvalues of the
white matrices $M_N$, we can deduce that

$$\mathbb{E}(\mathrm{Tr}\,V_N^{s_N}) = \mathbb{E}[\mathrm{Tr}(M_N^G)^{s_N}](1 + o(1)). \qquad (11)$$

Actually, we prove universality of all the moments (of fixed order) of traces
of high powers of $V_N$ as in (10) and (11). Using the machinery developed in [26]
(Sections 2 and 5) and (Section 2), we can then deduce that the limiting
distribution of the largest eigenvalue(s) for spiked population matrices $V_N$ satisfying
(H1)-(H3), as well as (H4) if $\pi_1 \ge w_c$, is the same as for complex non-white
Wishart matrices $V_N^G$. When $\pi_1 < w_c$, we more generally get the universality of
the limiting joint distribution of any fixed number of largest eigenvalues. Let us
roughly give the main ideas. On the one hand, the Laplace transform of the joint
distribution of a finite number of the (correctly rescaled) largest eigenvalues of
$V_N$ can be conveniently expressed in terms of joint moments of traces of the
matrix $V_N$ taken at suitable powers $s_N$ (those of the previous discussion). On
the other hand, the asymptotic distribution of the rescaled largest eigenvalues
(and also the corresponding Laplace transform) is well-known in the complex
Gaussian setting. One can then deduce from universality of moments of traces
that the asymptotic joint distribution of the largest eigenvalues for any model
$V_N$ considered here is the same as for the corresponding Gaussian case. The
detail of the derivation of such a result from formulas (10) and (11), including
the required asymptotics of correlation functions for the complex non-white
Wishart matrix $V_N^G$, can be found in [11], [5], [26] and [27].

In the real setting, our combinatorial reasoning also yields the universality
of moments of traces of high powers of $V_N$. This could be used in principle
to prove universality of the fluctuations of the largest eigenvalues, provided the
full counterpart of Theorem 1.2 in the real Gaussian case was established.
This is true in the case where the largest eigenvalue of the true covariance is
simple and satisfies $\pi_1 > w_c$ (cf. [18]). The non-Gaussian case is actually also
covered by Theorem 1.4. We can come to the same conclusion (assuming (H'4))
using our approach. In the sub-critical case where $\pi_1 < w_c$, we are also able
to conclude thanks to (11) (and its analogue for higher moments) which proves
that the fluctuations of eigenvalues of $V_N$ are similar to those, well-known, of
the real Wishart matrix $M_N^G$. In the critical case $\pi_1 = w_c$, we cannot conclude.

Our paper is organized as follows. In Section 2, we introduce the major
combinatorial tools needed to compute moments of traces of high powers of
$V_N$. We first recall the specific terminology and the main arguments (Section
2.2) developed by [20] for the investigations of the white case. We then present
(Section 2.3) the main ideas of the strategy we will use to deal with the non-white
case when $r = 1$. In Section 3, we establish the universality of the asymptotic
expectation of traces of high powers of $V_N$. We next consider higher moments
in Section 4. In Section 5, we discuss the main modifications needed to deal
with the case where $r > 1$.

Acknowledgments. A part of this work was done while the first author prepared
her PhD Thesis at the Institut de Mathématiques de Toulouse and she
acknowledges useful conversations with her advisor M. Ledoux. Some results
were also obtained last year during a postdoctoral fellowship at the Hausdorff
Research Institute for Mathematics of Bonn.

2 Combinatorial tools 

In this section, we define the major combinatorial tools needed to investigate
the asymptotics of moments of traces of large powers of $V_N$. We here extend
some of the tools used in [20] where the white case ($\Sigma = \mathrm{Id}$) is investigated.
The reader is referred to Sections 2 and 3 of the above cited article for a detailed
explanation of the following combinatorial approach. We here choose to explain
our strategy in the case where $r = 1$ and thus consider the covariance matrix

$$\Sigma = \mathrm{diag}(\pi_1, 1, 1, \dots, 1).$$

Modifications to handle more complex cases ($r > 1$) are indicated in Section 5.
Throughout the paper, we denote by $C, C', C_i$, $i = 1, 2, \dots$ some positive constants
independent of $N$ and whose value may vary from line to line.

2.1 Paths and 1-edges 

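Let $(s_N)$ be a sequence of integers that may grow to infinity. Developing the
trace, one obtains that

$$\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] = \frac{1}{p^{s_N}} \sum_{i_0,\dots,i_{s_N-1}}\ \sum_{j_1,\dots,j_{s_N}} \mathbb{E}\Big(\prod_{q=0}^{s_N-1} (\Sigma^{1/2}X)_{i_q j_{q+1}}\, \overline{(\Sigma^{1/2}X)_{i_{q+1} j_{q+1}}}\Big) \qquad (12)$$

$$\phantom{\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}]} = \frac{1}{p^{s_N}} \sum_{i_0,\dots,i_{s_N-1}}\ \sum_{j_1,\dots,j_{s_N}} \pi_1^{\#\{q\,:\ i_q = 1\}}\ \mathbb{E}\Big(\prod_{q=0}^{s_N-1} X_{i_q j_{q+1}}\, \overline{X_{i_{q+1} j_{q+1}}}\Big), \qquad (13)$$

where $i_q \in \{1, 2, \dots, N\}$ and $j_q \in \{1, 2, \dots, N, \dots, p\}$ (14)
and where we use the convention that $i_{s_N} = i_0$.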

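To each term $\prod_{q=0}^{s_N-1} X_{i_q j_{q+1}} \overline{X_{i_{q+1} j_{q+1}}}$ in (13), we associate three combinatorial
objects needed in the following. First, we define the "edge path" $\mathcal{P}$ formed
with oriented edges (read from bottom to top) by

$$\mathcal{P} = \binom{j_1}{i_0}\binom{j_1}{i_1}\binom{j_2}{i_1}\cdots\binom{j_{s_N}}{i_{s_N-1}}\binom{j_{s_N}}{i_0}.$$
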
Due to the symmetry assumption (H3), only paths for which any oriented edge
occurs an even number of times give a non zero contribution to the expectation. 
From now on, we only consider such even paths. 

To an arbitrary even edge path we associate a so-called Dyck path (Dyck
paths have a long history in Random Matrix Theory, see [35], [2] for instance),
that is a trajectory $x = \{x(t),\ t \in [0, 2s_N]\}$ on the positive half-lattice such that:

$$x(0) = x(2s_N) = 0;\quad \forall t \in [0, 2s_N],\ x(t) \ge 0 \text{ and } x(t) - x(t-1) = \pm 1.$$

To define the Dyck path $x$ associated to $\mathcal{P}$, we read the oriented edges of $\mathcal{P}$ in the
order of appearance and draw an up (resp. down) step $(1, +1)$ (resp. $(1, -1)$) if
the current edge is read for an odd (resp. even) number of times. Last, we also
associate to $\mathcal{P}$ a "usual" path denoted by $P$: we mark on the underlying Dyck
path $x$ the successive vertices met in $\mathcal{P}$ and then set $P := i_0\, j_1\, i_1\, j_2 \cdots i_0$.

The strategy in the rest of the paper can roughly be summarized as follows.
Given a trajectory $x$, we shall estimate the number of edge paths that can be
associated to $x$ and then we shall estimate their contribution to the expectation
(12). On the one hand, due to the constraint on the choice of the vertices,
we shall refine the enumeration of Dyck paths according to the number of odd
up steps. The way to handle such a specificity has been developed in detail in
[20]. We recall some points of the analysis made by [20] in the next subsection.
On the other hand, when estimating the contribution of such edge paths, we
also have to take into account the occurrences of the vertex 1 on the bottom
line since each occurrence yields an additional weight $\pi_1$ (recall (13)). To this
aim, we introduce the notion of 1-edge.

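Definition 2.1. A 1-edge is an oriented edge with 1 on the bottom line, i.e. an
edge joining the bottom vertex 1 to a top vertex $v \in \{1, \dots, p\}$.

Remark 2.1. An edge with 1 on the top line is not a 1-edge in our denomination
if its bottom vertex $v$ belongs to $\{2, \dots, N\}$.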

Thus, we shall be able to refine the analysis made in [20] to estimate the
contribution of edge paths with 1-edges.

The rest of this section is organized as follows. In Subsection 2.2, we recall the
main definitions and results of [20] that we will use throughout this paper. Note
that the investigations of [20] readily give the contribution to the expectation
(12) of edge paths without 1-edges (see Proposition 2.2). In Subsection 2.3,
we explain the main ideas of the strategy we will use to deal with paths with
1-edges and compute the corresponding contribution to the expectation (12).

2.2 The white case $\Sigma = \mathrm{Id}$ and paths with no 1-edges

The aim of this section is to recall the main definitions and results derived
from [20] that we will use throughout this paper. These results also allow us to
estimate at the end of this subsection the contribution to (12) of paths without
1-edges. We assume some familiarity of the reader with the combinatorial
machinery developed in refs. [Mjj [IS], [55] and [5D]. 

Notational Remark. Our notations differ from those used in [20]: the (white)
model considered by [20] corresponds here to $\frac{1}{N}XX^*$, that is $\gamma_N M_N$.

To handle the case where $\Sigma = \mathrm{Id}$ (that is the computation of $\mathbb{E}\,\mathrm{Tr}\,M_N^{s_N}$), [20]
first enumerates the associated Dyck paths according to the number of odd up
steps. Thus, we let $\mathcal{X}_{s_N,k}$ be the set of Dyck paths of length $2s_N$ with $k$ odd up
steps and $\mathcal{X}_{s_N} = \cup_{k=1}^{s_N} \mathcal{X}_{s_N,k}$ be the set of Dyck paths of length $2s_N$.

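Definition 2.2. [9] Let $\mathbf{N}(s_N, k)$ be the $k$th Narayana number defined by

$$\mathbf{N}(s_N, k) = \frac{1}{s_N}\binom{s_N}{k}\binom{s_N}{k-1}.$$

Then $\mathbf{N}(s_N, k) = \#\mathcal{X}_{s_N,k}$.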

The reader is referred to [ID] and references therein for further detail about 
Narayana numbers. 
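
The identity between Narayana numbers and the number of Dyck paths with $k$ odd up steps can be checked by brute force for small lengths. The Python sketch below assumes the standard closed form $\frac{1}{s}\binom{s}{k}\binom{s}{k-1}$ and counts an up step as "odd" when it ends at an odd instant; the function names are illustrative.

```python
from itertools import product
from math import comb


def narayana(s, k):
    """k-th Narayana number (1/s) * C(s, k) * C(s, k-1), for 1 <= k <= s."""
    return comb(s, k) * comb(s, k - 1) // s


def dyck_paths(s):
    """Yield all Dyck paths of length 2s as tuples of +1 / -1 steps."""
    for steps in product((1, -1), repeat=2 * s):
        height = 0
        for step in steps:
            height += step
            if height < 0:
                break
        else:
            if height == 0:
                yield steps


def count_by_odd_up_steps(s):
    """counts[k] = number of Dyck paths of length 2s with k up steps at odd instants."""
    counts = [0] * (s + 1)
    for steps in dyck_paths(s):
        k = sum(1 for t, step in enumerate(steps, start=1) if step == 1 and t % 2 == 1)
        counts[k] += 1
    return counts
```

For $s = 4$ this returns `[0, 1, 6, 6, 1]`, matching the Narayana numbers $1, 6, 6, 1$. The enumeration is exponential in $s$ and is only meant as a check on small cases.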

Given a Dyck path $x \in \mathcal{X}_{s_N,k}$, we shall estimate the number of edge paths
associated to it. First, one needs to assign a vertex from $\{1, \dots, N\}$ (resp.
$\{1, \dots, p\}$) to each even (resp. odd) moment of time along the Dyck path $x$.
For this, we need the following definition.

Definition 2.3. An instant $t \in [1, 2s_N]$ is said to be marked (in $x$ or in $P$) if
it corresponds to the right endpoint of an up edge.

Roughly speaking, marked instants correspond to the moments of time 
(apart from $t = 0$) where one can discover in $P$ a vertex never encountered
before. Thus, in order to estimate the number of paths that can be associated 
to a given trajectory x, we first choose the vertices occurring at the marked 
instants, which we call marked vertices, and at the origin of the path (which is 
non marked by definition). 

In order to choose the vertices occurring at the marked instants, we refine our
classification separating the cases where they are on the bottom or top line
in $\mathcal{P}$, that is the cases where they are marked at even or odd instants in $P$,
as follows. For any integer $0 \le i \le s_N$, we define two classes of marked vertices:
$\mathcal{N}_i = \{$vertices occurring $i$ times at an even marked instant$\}$ and $\mathcal{T}_i =
\{$vertices occurring $i$ times at an odd marked instant$\}$, and we set $n_i = \#\mathcal{N}_i$ and
$p_i = \#\mathcal{T}_i$. Then (cf. (14)), vertices encountered along a path $P$ at the odd (resp.
even) instants split into the disjoint classes $\mathcal{T}_i$ (resp. $\mathcal{N}_i$). Observe that $n_i = 0$,
$\forall i > s_N - k$, with $\sum_i n_i = N$ and $\sum_i i\,n_i = s_N - k$. Similarly, one has that
$p_i = 0$, $\forall i > k$, with $\sum_i p_i = p$ and $\sum_i i\,p_i = k$. A path $P$ can then be characterized
by its associated trajectory $x \in \mathcal{X}_{s_N,k}$ for some integer $1 \le k \le s_N$, and
its type:

$$(n_0, n_1, \dots, n_{s_N-k})(p_0, p_1, \dots, p_k) := (n, p).$$
Definition 2.4. A vertex $v \in \mathcal{N}_i$ (resp. $v \in \mathcal{T}_i$) is said to be of type $i$ on the
bottom (resp. top) line.

Any vertex $v \in \cup_{i \ge 2}\mathcal{N}_i$ (resp. $v \in \cup_{i \ge 2}\mathcal{T}_i$) is said to be a vertex of
self-intersection on the bottom (resp. top) line.

In the following, $M_1 = \sum_{i \ge 2}(i - 1)n_i$ (resp. $M_2 = \sum_{i \ge 2}(i - 1)p_i$) denotes
the number of vertices of self-intersection on the bottom (resp. top) line.
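
The bookkeeping behind marked instants and the type of a path can be illustrated with a short Python sketch. It assumes that the path alternates between bottom vertices (even instants) and top vertices (odd instants), and it only reports the counts $n_i$, $p_i$ for $i \ge 1$, since $n_0$ and $p_0$ depend on $N$ and $p$. The function name `path_type` and the returned dictionary layout are hypothetical.

```python
from collections import Counter


def path_type(bottom, top):
    """Marked-instant statistics of the path i_0 j_1 i_1 j_2 ... i_0.

    bottom = [i_0, ..., i_{s-1}], top = [j_1, ..., j_s].
    Returns k (number of odd marked instants), the counts n_i and p_i
    for i >= 1, and the numbers M1, M2 of self-intersections.
    """
    s = len(bottom)
    vertices = []
    for q in range(s):
        vertices += [bottom[q], top[q]]
    vertices.append(bottom[0])          # the path is closed: i_s = i_0

    seen = Counter()
    even_marks, odd_marks = Counter(), Counter()
    for t in range(1, 2 * s + 1):
        # Store each edge as (bottom vertex, top vertex).
        if t % 2 == 1:
            edge = (vertices[t - 1], vertices[t])
        else:
            edge = (vertices[t], vertices[t - 1])
        seen[edge] += 1
        if seen[edge] % 2 == 1:         # up step, so the instant t is marked
            if t % 2 == 1:
                odd_marks[vertices[t]] += 1
            else:
                even_marks[vertices[t]] += 1

    n = Counter(even_marks.values())    # n[i] = # bottom vertices marked i times
    p = Counter(odd_marks.values())     # p[i] = # top vertices marked i times
    return {
        "k": sum(odd_marks.values()),
        "n": dict(n),
        "p": dict(p),
        "M1": sum((i - 1) * c for i, c in n.items()),
        "M2": sum((i - 1) * c for i, c in p.items()),
    }
```

For `bottom = [1, 1, 2, 2]` and `top = [5, 6, 5, 6]`, the top vertex 5 is marked twice, so the path has one self-intersection on the top line ($M_2 = 1$) and $k = 3$ odd marked instants, in agreement with $\sum_i i\,p_i = k$.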



It is an easy fact that, given the type {n,p), the number of ways to assign 
vertices at the marked instants and choose the origin is at most 

n:fo"'-«!ntoP^!n.>2(*!)-n.>2(*!)^-- ^ ^ 

Once marked vertices are chosen, there remains to count the number of ways
to fill in the blanks of the path, i.e. assign vertices at the unmarked instants,
and evaluate the corresponding expectation of each "filled path". Due to
self-intersections, there may be many ways to fill in the blanks of the path as well
as edges seen many times. Actually the bound (17) would be enough for the
following as long as $s_N = o(\sqrt{N})$. It needs to be refined for higher scales $s_N$.
In particular and as explained in the beginning of Section 3.2 in [20], one must
pay attention to vertices of type 2. In (17), we used the rough estimate that if
$t \in \{1, \dots, k\}$ (resp. $t \in \{1, \dots, s_N - k\}$) is an odd (resp. even) marked instant
where the second occurrence of a vertex of type 2 is repeated, there are at most
$t - 1$ possible choices for the vertex to be repeated. This rough estimate needs
to be refined when considering vertices $v$ of type 2 belonging to edges seen more
than twice and vertices $v$ of type 2 for which there are multiple ways to close
an edge with $v$ as its left endpoint at an unmarked instant.
Let us first consider the latter class of vertices of type 2. Note that for such
a vertex $v$, there are at most three possible ways to close an edge with $v$ as
left endpoint at an unmarked instant. To investigate this class, we need a few
definitions from [20].

Definition 2.5. A vertex $v$ of type 2 is said to be non-MP-closed if it occurs at an odd
(resp. even) marked instant and if there is an ambiguity for closing an edge at
an unmarked instant starting from this vertex on the top (resp. bottom) line.

Let $t$ be a given marked instant. Assume that the marked vertices before $t$ have
been chosen and that, at the instant $t$, there is a non-MP-closed vertex. Then,
by definition of $x$ and of non-MP-closed vertices, there are at most $x(t)$ possible
choices for this vertex. This can be checked as in [25], p. 122.

Let us turn to vertices of type 2 which belong to an edge that is read four
times or more in the path. To consider such vertices, we need to introduce
other characteristics of the path. Let $\nu_N := \nu_N(P)$ be the maximal number of
vertices that can be visited at marked instants from a given vertex of the path
$P$. Let also $T_N := T_N(P)$ be the maximal type of a vertex in $P$. Then, if at the
instant $t$, one reads for the second time an oriented up edge $e$, there are at most
$2(\nu_N + T_N)$ choices for the vertex occurring at the instant $t$. Indeed, one shall
look among the oriented edges already encountered in the path and for which
one endpoint is the vertex occurring at time $t - 1$ (see the Appendix in [25] and
Section 5.1.2 in [11] e.g.).



Furthermore, the machinery developed by [5S], [53] and [5D] shows that 
once the marked vertices are assigned along a trajectory x and once the origin 
is chosen, the number of filled paths, weighted by their expectation, which are 
associated to $x$ (and of type $(n, p)$) is bounded by

TT (c^iO'"' TT (CiTO)'"P"3'^i+'^^cf c|^ (18) 

^ ;=3 m=3 

where the extra factor 2 comes from the negligible case where the origin $i_0$ is
marked; the $C_i$'s are positive constants independent of $k$, $p$, $N$ and $s_N$; $r_i$, $i = 1, 2$
(resp. $q_i$, $i = 1, 2$) count the number of vertices of type 2 on the bottom/top
line which are non-MP-closed (resp. belong to an edge seen more than twice).



Combining the above with some ideas previously developed in the above cited
papers, it is shown in Section 3 in [20] that the contribution to $\mathbb{E}\,\mathrm{Tr}\,M_N^{s_N}$ from
paths with k odd marked instants and of type (n, p) is bounded by



CNisN,k)N^'^^ 



2N 



^3(sAr — k) max x{t)^ {^i{si^ — k){h'N + TV) 



ri! 



91 ! 



(^{^N^y (3fcmaxa;(i)y 



(n2 

2 \ P2-r2-q2 



ri - 9i)! 



r2'- 



C2k{vN + Tn, 



92 



{P2 



f 2 - 92 j 



92! 



i>3 ^ 



n 

i>3 



(k-M2y 

2p 



(19) 



where

- $C$, $C_1$ and $C_2$ are positive constants independent of k, p, N and $s_N$;

- $\max x(t)$ is the maximal level reached by a trajectory x;

- $\mathbb{E}_k$ is the expectation with respect to the uniform distribution on $\mathcal{X}_{s_N,k}$.

Moreover, [20] (Section 3) establishes important estimates on the previous quantities.
Proposition 3.1 in [20] proves that typical paths (that is paths which contribute
in a non-negligible way to $\mathbb{E}\,\mathrm{Tr}\,M_N^{s_N}$) satisfy the following constraints:

a) the number k of odd marked instants lies in the interval $[\alpha' s_N, \alpha s_N]$ for
any $\alpha', \alpha$ such that $0 < \alpha' < \frac{\sqrt{\gamma}}{1+\sqrt{\gamma}} < \alpha < 1$;

b) $\nu_N + T_N \ll \sqrt{s_N}$;

c) there exists a constant $C > 0$, independent of k, p, N and $s_N$, such that
$M_1 + M_2 < C\sqrt{s_N}$.

Besides, [20] also proves that $\max x(t) \sim \sqrt{s_N}$ in typical paths. More precisely,
it is shown in Lemma 3.1 in [20] that

$$\forall a > 0,\ \exists C(a) < \infty,\quad \max_{k}\, \mathbb{E}_k\left[\exp\{a \max x(t)/\sqrt{s_N}\}\right] < C(a). \qquad (20)$$

All these results combined with (19) lead to one of the main results of Section
3 in [20].

Proposition 2.1. If $\Sigma = \mathrm{Id}$ and $s_N = O(N^{2/3})$, the paths with k odd marked
instants contributing to $\mathbb{E}\,\mathrm{Tr}\,M_N^{s_N}$ in a non-negligible way have edges read only
twice, a non marked origin and no vertex of type strictly greater than 3.
Furthermore, there exists a constant $C > 0$ such that their contribution is at most

N(.^, fc) A^7^^" a^^" E^exp (g ^^^ )] exp . (21) 

From these computations, one readily deduces that the contribution to the
expectation (12) from paths without 1-edges is characterized as follows.

Proposition 2.2. The typical contribution to the expectation (12) from paths
with no 1-edges is at most of the order of



^ N(s^,fc)7V^a^^"E, 



„2.„^, r exp fe-'''^ ^^^^ 



N 



where C and C' are positive constants independent of N. Typical paths amongst
those without 1-edges have edges read only twice, a non marked origin and
vertices of type at most 3.

Proof of Proposition 2.2. Here the vertices $i_j$'s must be chosen from the
set $\{2, \dots, N\}$ instead of $\{1, \dots, N\}$, since the vertex 1 is assumed not to
occur on the bottom line. Formula (19) must be simply multiplied by a factor
$(1 - N^{-1})^{s_N - k + 1}$. This has no impact on the final result for N large enough.
Proposition 2.2 follows by summation on the typical k's and the fact that
$\sum_{k=1}^{s_N} \mathbf{N}(s_N, k)\, N \gamma_N^{k - s_N} = O\big((u_+/\sigma^2)^{s_N}\big)$ (see Remark 2.4 in [20]). □
Notational Remark. From now on, we simplify the notations and use P to denote 
a usual path as well as its associated edge path V. 

We shall now be able to refine the counting procedure of [20] to estimate
the contribution to the expectation from edge paths with 1-edges. 
The problem of evaluating directly the number of 1-edges occurring in a path 
turns out to be difficult. Thus, our strategy will be indirect. Instead of directly 
evaluating the contribution of paths P as well as the number of its 1-edges, we 
first evaluate the contribution of paths with a prescribed number of 1-edges. 
This is the aim of the following subsection.

2.3 Counting the number of 1-edges



In this subsection, we define a procedure, called the gluing procedure, which allows one to
enumerate the paths P according to the number of their 1-edges. One can first
notice that the contribution of a path P with 1-edges to the expectation (12) is
a weighted term related to its contribution to the expectation $\mathbb{E}[\mathrm{Tr}\,M_N^{s_N}]$, where
$M_N = \frac{1}{p} X X^*$ is the associated white matrix. One simply assigns a weight $\pi_1$
to each occurrence of the vertex 1 on the bottom line of P. Consider for a while
a path P of length $2s_N$ having $s \ge 1$ pairs of 1-edges and with 1 for origin.
The basic idea is that the vertex 1 necessarily occurs on the bottom line of P
at the instants where the trajectory x of P hits the level 0. If one furthermore
assumes that x hits exactly s times the level 0, then all the occurrences of 1 on
the bottom line of P correspond to the returns to 0 of x. Thus the enumeration
of 1-edges transfers to statistics on the number of returns to 0 for Dyck paths.
This observation is the basic idea of the gluing procedure.
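
The reduction to return statistics can be explored numerically. The sketch below is only a brute-force illustration and is not part of the original argument: it enumerates Dyck paths of length 2n, counts their returns to level 0 (with the convention, assumed here, that the final return at time 2n is counted), and compares the result with the classical closed form m/(2n-m) C(2n-m, n). The helper names `dyck_paths`, `count_returns`, `returns_histogram` and `returns_closed_form` are hypothetical.

```python
from itertools import product
from math import comb


def dyck_paths(n):
    """Yield all Dyck paths of length 2n as tuples of +1/-1 steps."""
    for steps in product((1, -1), repeat=2 * n):
        level = 0
        for step in steps:
            level += step
            if level < 0:
                break
        else:
            if level == 0:
                yield steps


def count_returns(steps):
    """Number of instants t in (0, 2n] at which the trajectory is at level 0."""
    level, returns = 0, 0
    for step in steps:
        level += step
        if level == 0:
            returns += 1
    return returns


def returns_histogram(n):
    """Map m -> number of Dyck paths of length 2n with exactly m returns."""
    histogram = {}
    for path in dyck_paths(n):
        m = count_returns(path)
        histogram[m] = histogram.get(m, 0) + 1
    return histogram


def returns_closed_form(n, m):
    """Classical count m/(2n-m) * C(2n-m, n) of Dyck paths with m returns."""
    return m * comb(2 * n - m, n) // (2 * n - m)
```

Weighting each path by a power of $\pi_1$ equal to its number of returns then gives, for small n, the kind of weighted sum that the gluing procedure is designed to control.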

Throughout the paper, we denote by 2s the number of 1-edges of a path P. 
Starting from a general path P of length $2s_N$, the gluing procedure associates
a new path P' with origin 1 as we now explain. 

2.3.1 Subpaths starting and ending with a 1-edge 

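We denote by $\binom{h_i}{1}\binom{g_i}{1}$, $i = 1, \dots, s$, the pairs (not necessarily distinct) of
successive 1-edges occurring in P. One can then write P as

$$P = \binom{j_1}{i_0}\binom{j_1}{i_1} \cdots \binom{h_1}{1}\binom{g_1}{1} \cdots \binom{h_2}{1}\binom{g_2}{1} \cdots \binom{h_s}{1}\binom{g_s}{1} \cdots \binom{j_{s_N}}{i_{s_N-1}}\binom{j_{s_N}}{i_0}.$$

Using these 1-edges, P splits into s subpaths $P_i$, $i = 1, \dots, s$, defined as follows.
For $i = 2, \dots, s$, we call $P_i$ the subpath starting at $\binom{g_{i-1}}{1}$ and ending at $\binom{h_i}{1}$.
Let then $P_1$ be the subpath $\binom{g_s}{1} \cdots \binom{j_{s_N}}{i_0}\binom{j_1}{i_0} \cdots \binom{h_1}{1}$.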

Remark 2.2. In the particular case where $i_0 = 1$, the edge path reads as



P = 



The sole difference here is that $g_s = j_1$ and $P_1$ is the subpath beginning P.



Let $t_1, \dots, t_s$ denote the instants at which the successive s pairs of 1-edges
occur in P. The length $l_i$ of each subpath is determined by the instants $t_i$ since
$l_i = t_{i+1} - t_i$ (note that these lengths are necessarily even). One can also
note that each 1-edge occurring in the path is necessarily read an even number of
times. Thus, for any $1 \le i \le s$, there exists j such that $h_i = h_j$ or $h_i = g_j$.
This observation is crucial for the sequel. As we will explain in the following
subsection, this allows us to re-order the paths $P_i$ and erase some of the 1-edges.



2.3.2 Gluing and reordering the paths $P_i$

Given the set of 1-edges, we now define a graph G on the set $\mathcal{L} = \{g_{i-1}, h_i,\ i = 1, \dots, s\}$
($\#\mathcal{L} \le s$) of the vertices occurring in 1-edges, using the convention
that $g_0 = g_s$. We draw an edge between $g_{i-1}$ and $h_i$ for $i = 1, \dots, s$. Note
that multiple connections are allowed. We denote by l, $1 \le l \le s$, the number of
connected components of G. We then group all together the subpaths associated
to vertices of the same connected component of G. This leads to l subsets which
we call clusters. Clusters are ordered in the order they are encountered in P
and we denote them by $S_j$, $1 \le j \le l$.
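
As an illustration of the construction of clusters, the following sketch groups the subpaths by connected components of G with a small union-find structure. It is a hedged example, not the authors' procedure: the function name `clusters` and the list encoding of the $g_i$ and $h_i$ are assumptions made for the example, and the convention $g_0 = g_s$ is implemented through Python's negative indexing.

```python
def clusters(g, h):
    """Group subpath indices 1..s by connected components of G.

    g and h are lists of length s holding g_1..g_s and h_1..h_s; the edge
    attached to subpath P_i joins g_{i-1} and h_i, with g_0 = g_s.
    """
    s = len(g)
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for i in range(s):
        union(g[i - 1], h[i])  # g[-1] is g_s, which plays the role of g_0

    groups = {}
    for i in range(s):
        groups.setdefault(find(h[i]), []).append(i + 1)
    # Order clusters by the first subpath they contain, as in the text.
    return sorted(groups.values(), key=lambda members: members[0])
```

For instance, with top vertices g = (a, b, b) and h = (b, a, b), the 1-edge with top vertex b is read four times and the subpaths $P_1$ and $P_3$ fall in the same cluster, while $P_2$ forms a cluster on its own.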

These clusters will now be used to build a new path P' from P as follows.
We first define a way to glue the subpaths belonging to the same cluster. For
any $1 \le j \le l$, we will denote by $P_j^f$ the final path obtained by the gluing of the
subpaths from the cluster $S_j$.

• Assume first that $\#\mathcal{L} = s$ so that each 1-edge occurs exactly twice in P.
Consider the first cluster which, by definition, begins with the subpath $P_1$.
We first read $P_1$ until meeting the edge $\binom{h_1}{1}$. If $g_s = h_1$ (then $S_1$ contains
only $P_1$), then the process stops and the path $P_1^f$ is equal to $P_1$. Otherwise,
there exists $j_0 \ge 2$ such that $P_{j_0}$ has the edge $\binom{h_1}{1}$ as an endpoint. In the
case where $\binom{h_1}{1}$ is the left endpoint of $P_{j_0}$, we concatenate $P_1$ and $P_{j_0}$
and then erase the two occurrences of the 1-edge $\binom{h_1}{1}$. In the case where
the edge $\binom{h_1}{1}$ is the right endpoint of $P_{j_0}$, we read $P_{j_0}$ in the reverse
order and apply a similar procedure. This "gluing" defines a new subpath
which we denote by $P_1 \vee P_{j_0}$. We then restart the procedure with $P_1$
replaced with $P_1 \vee P_{j_0}$ until all the subpaths belonging to the first cluster
are glued, leading to the final subpath $P_1^f$. We then proceed in the same
way with other clusters.

• If $\#\mathcal{L} < s$ then some clusters have 1-edges that occur four times or more
in P. We can find a way to read all the edges of such a cluster without
"raising the pen". This follows from the fact that the vertices of G are
all of even valency. We then choose one way to do so and glue the paths
of these clusters accordingly. For clusters having 1-edges that occur only
twice, we apply the previous gluing method.

We end up with l paths $P_j^f$, $j = 1, \dots, l$, which begin and end with a 1-edge. By
definition of the clusters, these 1-edges form l pairwise distinct pairs of oriented
edges. We then call P' the path obtained by the concatenation of the $P_j^f$'s, that
is $P' = P_1^f \cup \dots \cup P_l^f$. The length of P' is $2(s_N - (s - l))$ and its origin is 1,
which is a non marked vertex on the bottom line. We call x' its trajectory.
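
The "without raising the pen" step used above for clusters with repeated 1-edges is the classical existence of an Eulerian circuit in a connected multigraph whose vertices all have even valency. A minimal sketch of one standard way to produce such a circuit (Hierholzer's algorithm) is given below; it is offered as an illustration under that assumption and is not claimed to be the choice made in the paper. The function name `eulerian_circuit` and the edge-list encoding are hypothetical.

```python
def eulerian_circuit(edges):
    """Return a closed walk using every edge of a connected multigraph once.

    edges is a list of (u, v) pairs; parallel edges and loops are allowed.
    Raises ValueError when some vertex has odd valency.
    """
    adjacency = {}
    for index, (u, v) in enumerate(edges):
        adjacency.setdefault(u, []).append((v, index))
        adjacency.setdefault(v, []).append((u, index))
    # A loop (u, u) is stored twice at u, so it contributes 2 to the valency.
    if any(len(neighbours) % 2 for neighbours in adjacency.values()):
        raise ValueError("every vertex must have even valency")

    used = [False] * len(edges)
    stack, circuit = [edges[0][0]], []
    while stack:
        vertex = stack[-1]
        # Discard edges already traversed from the other endpoint.
        while adjacency[vertex] and used[adjacency[vertex][-1][1]]:
            adjacency[vertex].pop()
        if adjacency[vertex]:
            neighbour, index = adjacency[vertex].pop()
            used[index] = True
            stack.append(neighbour)
        else:
            circuit.append(stack.pop())
    return circuit[::-1]
```

Any circuit returned in this way is one admissible reading order for the subpaths of the cluster; different circuits correspond to the freedom mentioned in the text ("we then choose one way to do so").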

The basic idea of the gluing procedure defined above can be roughly explained 
as follows. A path P (or P') is said to be typical if it contributes in a
non-negligible way to the expectation (12). We first identify the typical paths P'.
The simplest of these typical paths are such that the number of occurrences of
1-edges is determined by the number of returns to 0 of their associated trajectory
x'. Then, given a typical path P', one has to estimate the number of paths
P that can be associated to it as well as their expectation. When considering 
the expectation, we shall take into account the added weight due to the erased 
1-edges. This problem will be considered in the following section. We here 
establish the needed estimate for the number of preimages P of a path P'. 

2.3.3 Number of preimages of a glued path P' 

The simplest case is when $l = s$ since all the preimages P coincide with P' up
to the translation of the origin. Then if the first return to 0 of the trajectory x'
associated to P' holds at the instant $T = 2s_1$, there are exactly (resp. at most)
$s_1$ preimages of P' if x' returns $m = s$ (resp. $m < s$) times to the level 0. In
the other case where $l < s$, the following estimate holds true.

Lemma 2.1. Assume that the first return to 0 of x' holds at time $T = 2s_1$.
Then the number of preimages P of the path P' does not exceed

$$s_1 \binom{s_N}{s-l} (2s)^{s-l}. \qquad (22)$$

Proof of Lemma 2.1. In order to reconstruct the initial subpaths $P_i$, $i = 1, \dots, s$,
from P', we first need to choose the $s - l$ instants where we have erased
1-edges. The set T of these instants combined with the l occurrences of pairs of
1-edges in P' (which determine the paths $P_j^f$) define s subpaths $P_i'$ which are
the subpaths $P_i$ possibly read in the reverse direction. Then one has to define
the order in which the subpaths $P_i'$ are read. The sole constraint on this order
bears on the path starting each cluster, as we now explain. Consider for instance
a cluster, say $S_j$ in P and its corresponding counterpart $P_j^f$ in P'. Call t the
first instant of T chosen in $P_j^f$. Then the subpath $P_i'$ starting $P_j^f$ and ending at
t is the first subpath (with the same direction) of the cluster $S_j$ met in P. Last
and in order to define completely the path P, one also has to choose the origin
of P. Thanks to the above, we can now show that the number of preimages of
a given path P' does not exceed:

It is clear that the previous binomial coefficient comes from the choice of the
$s - l$ instants of T (noticing that these instants are necessarily odd). To explain
the remaining terms in (23), we denote by $x_j$ the number of subpaths $P_i$ in each
cluster $S_j$, $1 \le j \le l$. Then, let us consider the first cluster $S_1$. As clusters are
interlaced in P, we also have to choose the places where we read the $x_1 - 1$
paths of $S_1$ (different from $P_1$) and choose the order in which we read them.
There are $\binom{s-1}{x_1-1}(x_1 - 1)!$ such choices. Furthermore one can also choose the
direction in which one reads each of the $P_i'$ (to obtain $P_i$) not beginning $P_1^f$.
There are two choices for this direction. Having done so, the first empty "slot"
corresponds necessarily to the time where we read the first path of the second
cluster. We use the same procedure to define the order and the direction in
which the remaining subpaths of the second (and subsequent) clusters are read.
Thus the number of ways to determine and reorder the subpaths $P_i$ is at most

where we took the convention that $x_0 = 0$. Last, we shall add a term $s_1$ coming
from the determination of the origin $i_0$ of the initial path P (which amounts to
choosing a vertex occurring on the bottom line of $P_1$). This yields (23) and it
is then easy to deduce Lemma 2.1. □
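
The passage from the cluster-by-cluster count to a bound depending only on s and l can be checked on small cases. The sketch below encodes one reading of the counting argument, stated here as an assumption rather than as the paper's formula: for the j-th cluster, the first path is forced into the first empty slot, and the remaining $x_j - 1$ paths are placed, ordered and oriented, which gives a factor $\binom{s - 1 - \sum_{i<j} x_i}{x_j - 1}(x_j - 1)!\,2^{x_j - 1}$. Under this reading the product is at most $(2s)^{s-l}$. The function names `reorder_count` and `crude_bound` are hypothetical.

```python
from math import comb, factorial


def reorder_count(cluster_sizes):
    """Product over clusters of (places) x (orders) x (directions).

    cluster_sizes[j] is x_{j+1}, the number of subpaths in the (j+1)-th
    cluster. The first path of each cluster is forced into the first empty
    slot, which leaves x_j - 1 paths to place, order and orient.
    """
    s = sum(cluster_sizes)
    placed, total = 0, 1
    for x in cluster_sizes:
        free_slots = s - placed - 1  # slots left after the forced first path
        total *= comb(free_slots, x - 1) * factorial(x - 1) * 2 ** (x - 1)
        placed += x
    return total


def crude_bound(cluster_sizes):
    """The bound (2s)^(s-l) with s subpaths and l clusters."""
    s, l = sum(cluster_sizes), len(cluster_sizes)
    return (2 * s) ** (s - l)
```

The inequality follows from $\binom{a}{b}\,b! \le a^b \le s^b$ applied to each cluster, together with $\sum_j (x_j - 1) = s - l$.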

3 Estimate of $\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}]$ when $\Sigma = \mathrm{diag}(\pi_1, 1, \dots, 1)$

We here prove the universality of the expectation (12) in various scales $s_N$
depending on the value of $\pi_1$ with respect to the critical value $w_c = 1 + 1/\sqrt{\gamma}$.
Let $c > 0$ be a given real number. In the next theorem, $(s_N)$ is a sequence of
integers such that

$\lim_{N \to \infty} s_N/\sqrt{N} = c$ if $\pi_1 > w_c$,

$\lim_{N \to \infty} s_N/N^{2/3} = c$ if $\pi_1 \le w_c$.

Theorem 3.1. Let $V_N$ be a complex (resp. real) matrix satisfying (H1)-(H3).
If $\pi_1 \ge w_c$, we also assume that $V_N$ satisfies (H4) (resp. (H4')).

(i) Assume that $\pi_1 > w_c$. Then there exists a constant $C_4 > 0$ which depends
on $\max_j \mathbb{E}(|X_{1j}|^4)$ such that for N large enough,

$\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] \le C_4\,\tau(\pi_1)^{s_N}$ and $\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] = \mathbb{E}[\mathrm{Tr}\,(V_N^G)^{s_N}]\,(1 + o(1))$.

(ii) Assume that $\pi_1 = w_c$. Then there exists a constant $C_4 > 0$ which depends
on $\max_j \mathbb{E}(|X_{1j}|^4)$ such that for N large enough,

$\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] \le C_4\,u_+^{s_N}$ and $\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] = \mathbb{E}[\mathrm{Tr}\,(V_N^G)^{s_N}]\,(1 + o(1))$.

(iii) Assume that $\pi_1 < w_c$. Then there exists a constant $C > 0$ such that for
N large enough,

$\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] \le C\,u_+^{s_N}$ and $\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] = \mathbb{E}[\mathrm{Tr}\,(V_N^G)^{s_N}]\,(1 + o(1))$.

More precisely, in (i) if $V_N$ is complex, one can show that
$\mathbb{E}[\mathrm{Tr}\,V_N^{s_N}] = (1 + o(1))\,\tau(\pi_1)^{s_N} \exp\left[(s_N^2/2N)\,\sigma(\pi_1)^2/\tau(\pi_1)^2\right]$.
In the real setting, the same estimate
holds with $\sigma(\pi_1)$ replaced by $\sqrt{2}\,\sigma(\pi_1)$. These estimates can trivially be deduced
from Theorem 1.2 (ii) and its real counterpart (due to [18]) combined with some
considerations close to those made in Section 2 of [11].

We point out that in the three regimes, the asymptotics of E [TrV^"] differ in the 
complex and real settings. This is not surprising since the limiting distributions 
of the largest eigenvalues are different. Through the combinatorial analysis, this 
fact is justified by the existence of non-MP-closed vertices in some typical paths. 
The investigation of such vertices is here really similar to that made in piT and 
we refer to Section 12.21 above and ^ for more detail. 

This section is devoted to the proof of Theorem 3.1. The contribution to the
expectation of paths with no 1-edges has been evaluated in Proposition
2.2. This section is devoted to the estimation of the contribution from edge
paths P having 1-edges. We shall show that if $\pi_1 < w_c$, this contribution
is negligible with respect to that from paths without 1-edges (which is of the
order of $\mathbb{E}[\mathrm{Tr}\,M_N^{s_N}]$, cf. Section 2.2 above). On the other hand, when $\pi_1 \ge w_c$,
we shall prove that paths with 1-edges contribute in a non-negligible way. As
announced, our proof will make use of the gluing procedure. Thus, we will
first consider the glued paths P' and find the typical ones, that is those which
contribute in a non-negligible way to the expectation. We will easily see that
the typical paths P' have all their edges passed twice. Then, given a typical
path P', we shall estimate the contribution of all its preimages P. This will
require examining the added weight due to the erased 1-edges: by construction
of the gluing procedure, it may happen that P has some 1-edges passed at least
four times. We shall check that the contribution of the typical paths P depends
only on the second and fourth moments of the $X_{ij}$'s.

Before we proceed, we need a few notations. A glued path P' is of length
$2(s_N - (s - l))$ where s (resp. l) denotes the number of pairs of 1-edges (resp.
of clusters) in its preimages P. Note that $s - l$ counts the number of pairs of
1-edges that have been erased by the gluing procedure; l counts the pairs of
1-edges in P'. Besides, the origin of P' is the vertex 1 and is a non marked vertex
on the bottom line. Throughout the paper, we also denote by m the number of
times the trajectory of P' goes back to the level 0. Note that in general $m \le l$.

Definition 3.1. We call $\mathbf{N}(s_N - (s - l), k, m)$ the number of Dyck paths of
length $2(s_N - (s - l))$ with k odd marked instants and m returns to 0.

The simplest case to deal with is when $m = l$, that is when all the occurrences
of the vertex 1 on the bottom line of P' are encountered at the instants where
its trajectory hits the level 0. Thus, when $m = l$, estimating the contribution
to (12) of paths P follows from statistics on the number of returns to 0 of the
underlying Dyck paths of P' (each return is weighted by $\pi_1$). This observation
justifies the following definition.

Definition 3.2. A path P' is said to be a fundamental path if all its 1-edges 
occur at level 0. 
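
The counting function of Definition 3.1 can be tabulated by brute force for small lengths. The sketch below is an illustration under two stated assumptions: "odd marked instants" are read as up steps taken at odd instants, and the return at the final instant is counted among the m returns. It also checks that summing over m recovers the Narayana numbers $\frac{1}{n}\binom{n}{k}\binom{n}{k-1}$, which is the classical count of Dyck paths of length 2n with k odd up steps. The names `joint_counts` and `narayana` are hypothetical.

```python
from itertools import product
from math import comb


def joint_counts(n):
    """Map (k, m) -> number of Dyck paths of length 2n with k up steps at
    odd instants and m returns to level 0 (the return at 2n included)."""
    counts = {}
    for steps in product((1, -1), repeat=2 * n):
        level, odd_ups, returns, valid = 0, 0, 0, True
        for instant, step in enumerate(steps, start=1):
            level += step
            if level < 0:
                valid = False
                break
            if step == 1 and instant % 2 == 1:
                odd_ups += 1
            if level == 0:
                returns += 1
        if valid and level == 0:
            key = (odd_ups, returns)
            counts[key] = counts.get(key, 0) + 1
    return counts


def narayana(n, k):
    """Narayana number (1/n) C(n, k) C(n, k-1)."""
    return comb(n, k) * comb(n, k - 1) // n
```

A fundamental path in the sense of Definition 3.2 is then weighted by $\pi_1^m$, so that the table indexed by (k, m) is exactly the data entering the sums over k and m in the next subsection.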

In the following subsection, we present the detailed computations of the
contribution to (12) from edge paths P associated to a fundamental path P'. In
Subsection 3.2, we consider the set of non-fundamental paths P', that is the case
where some clusters in the initial paths P do share edges in such a way that
some 1-edges in P' occur at levels greater than 0. As we will see, this requires
refining the analysis and defining a new gluing procedure.

3.1 All the occurrences of the vertex 1 on the bottom line
are made at level 0

In this subsection, we investigate the set of paths P such that the l different
clusters $S_j$, $j = 1, \dots, l$, after the gluing procedure, yield a fundamental path P'.
We start from such P' and examine the possible added weight when reversing
the gluing process, that is the expectation from the erased 1-edges. The simplest
case, examined in the subsequent proposition, is when there is no erased 1-edge
through the gluing process.

Denote by $Z_1$ the contribution of paths P having 1-edges passed at most twice
and whose associated glued path P' satisfies $m = l = s$. P' has then s returns
to 0 and length $2s_N$.

Proposition 3.1. One has that

(i) $Z_1 = O(1)\,\tau(\pi_1)^{s_N}$ if $\pi_1 > w_c$ and $s_N = O(\sqrt{N})$,

(ii) $Z_1 = O(1)\,u_+^{s_N}$ if $\pi_1 = w_c$ and $s_N = O(N^{2/3})$,

(iii) $Z_1 = o(1)\,u_+^{s_N}$ if $\pi_1 < w_c$ and $s_N = O(N^{2/3})$.

Proof of Proposition 3.1. By assumption, the paths P and P' coincide up
to a translation of the origin. Furthermore, the vertex 1 is the origin of P', is
non marked and, by definition of the gluing procedure, the path P' has 1-edges
seen at most twice. To define P from P', one only has to determine the origin
of the path P, which amounts to choosing an even instant before the first return
to 0 of the trajectory associated to P'.

Let $\alpha, \alpha'$ be such that $0 < \alpha' < \frac{\sqrt{\gamma}}{1+\sqrt{\gamma}} < \alpha < 1$. Call $\tilde Z_1$ the contribution
from paths P associated to fundamental paths and for which the number k of
odd up steps satisfies $\alpha' s_N < k < \alpha s_N$. The computations of [20] (Section 3),
summarized in Section 2.2, can be copied to show that typical paths P' of length
$2s_N$ having k odd up steps and s returns to the level 0 have edges passed only
twice. Let $\mathbb{E}_{k,s}$ denote the expectation with respect to the uniform distribution
on the set of Dyck paths with k odd up steps and s returns to the level 0. It is
in particular a minor modification to show that the estimate (20) holds when
$\mathbb{E}_k$ is replaced with $\mathbb{E}_{k,s}$ (and $\mathbf{N}(s_N, k)$ with $\mathbf{N}(s_N, k, s)$) uniformly in s: the
proof can be deduced for instance using arguments given in [11], Lemma 7.10
(2nd case). As already recalled in Section 2.2, it is also proved in [20] that a
typical path of length $2s_N$ with s returns to 0 has an unmarked origin, no edge
read more than twice and no vertex of type strictly greater than 3. Thus, we
can deduce that typical paths P in $\tilde Z_1$ have edges passed only twice. From the
above, we deduce that

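$$\tilde Z_1 = O(1) \times \sigma^{2 s_N} \sum_{1 \le s_1 \le s_N}\ \sum_{1 \le k_1 \le s_N} s_1\, \mathbf{N}(s_1 - 1, s_1 - k_1) \sum_{m=1}^{s_N - s_1}\ \sum_{k = k_1 + m - 1}^{k_1 + s_N - s_1} \mathbf{N}(s_N - s_1, k - k_1, m - 1)\, \pi_1^{m}\, \gamma_N^{k - s_N}. \qquad (24)$$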

We now consider the paths P contributing to $Z_1$ and for which $k > \alpha s_N$ or
$k < \alpha' s_N$ and show that they contribute in a negligible way to $Z_1$. To this aim,



let k 



+ 1. Here we show that there exists a constant C > such 



that for any integer n (with < k + n < sn), 

N{sN, k + n, m)7^+" < Ce-^"'/^«N(sjv, k, m)-i%. (25) 

This will imply that the main contribution to $Z_1$ comes from paths with
approximately k odd marked instants, so that $Z_1 = \tilde Z_1(1 + o(1))$. To prove (25),
we write

N(sjv,fc + n,m)7^+" = ^ ^ ]J N(s, - 1, - fc,)7^S 

si ,...,s„i ki,...,km ^—1 

where the starred sums bear on integers Si summing to s^r and ki summing to 



k + n. We also set kt = 
Remark 2.4 of f2D], one has that 



(+1) so that J2i ^i — ^- Using the ideas of 



N(s, - 1, - ki)-f''^ < e^~^'^^'iT^^N(s, - 1, - fc,)7ir' 

for some constant C" > 0, independently of Si. Furthermore setting ki — ki 
Xi{si — 1), one can easily show that 

* / m 

E oxp -c'j: 

ki,...,km \
z— 1 / z— 1

n v^^, 

4=1 

for some constant C independent of the s^'s and m. Moreover, one can show 
that N(sjv, A:, m) < CJ^i — 1 N(si — 1, Si — ki). Indeed, given Si, the number 
of ki contributing in a non neghgible way to Hl^i N(si — 1, Si — ki)j^, 
where fc^ = fc, is of the order y'si — 1 and each product term is of order 

of YiiLi N(si — 1, Si — ki)"f'^ (uniformly in Si). Thus we get ((25|) and we can 
conclude directly that the contribution of paths for which $k > \alpha s_N$ or $k < \alpha' s_N$
is negligible and that $Z_1 = \tilde Z_1(1 + o(1))$.

There now remains to prove that (24) yields Proposition 3.1. This is obtained
from Lemma 3.1 stated and proved below. □

We now turn to the proof of the announced Lemma 3.1. Let $n \ge 1$ be an
integer. Set

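$$a_n = \sum_{s_1 = 1}^{n}\ \sum_{k_1 = 1}^{n} s_1\, \mathbf{N}(s_1 - 1, s_1 - k_1) \sum_{s=1}^{n - s_1}\ \sum_{k = k_1 + s - 1}^{k_1 + n - s_1} \mathbf{N}(n - s_1, k - k_1, s - 1)\, \pi_1^{s}\, \gamma_N^{k - n}, \qquad (26)$$

so that (24) $= O(1) \times \sigma^{2 s_N} a_{s_N}$.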

Lemma 3.1. Let = a'^^Un- For n large enough, one has that 

(i) if TTi > Wc then ajj — r(7ri)"(l + o(l)); (m) if tti — Wc then o!^ = u" (1 + 

o(l)); (Hi) if m < Wc then a'^ = -^u^^{l + o(l)). 

Proof of Lemma 3.1. The proof makes use of various generating functions,
for which we need a few definitions. Let $X_n$ denote the set of Dyck paths of
length 2n. For a trajectory $x \in X_n$, we define

$r_x := \#\{t \in\, ]0, 2n],\ x(t) = 0\}$, $o_x := \#\{\text{odd marked instants of } x\}$,
$e_x := \#\{\text{even marked instants of } x\}$.

Introduce the generating functions

$$F(\pi_1, \gamma, z) = \sum_{n \ge 0} \sum_{x \in X_n} \pi_1^{r_x + 1} \gamma^{-e_x} z^n, \qquad K(z) := \sum_{n \ge 0} (n+1) \sum_{x \in X_n} \gamma^{-o_x} z^{n+1}.$$

Then the function

$$H(z) := F(\pi_1, \gamma, z) K(z) \qquad (27)$$

is "almost" the generating function associated to the terms $a_n$ and thus to those
occurring in (24). Indeed in the definition of $a_n$ as in (24), we have $\gamma_N$ instead
of $\gamma$. This will have no impact on the following reasoning as $\lim_{N \to \infty} \gamma_N = \gamma$
and we are interested in large N-asymptotics. Thus expanding H as a power
series $H(z) = \sum_{n \ge 0} a_n z^n$, one has that $Z_1 = O(1) \times \sigma^{2 s_N} a_{s_N}$.

In order to determine the asymptotics of $a_n$, we now turn to the evaluation
of the generating functions. This is the aim of the subsequent lemma.

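Lemma 3.2. One has that

$$F(\pi_1, \gamma, z) = \frac{2}{2/\pi_1 - (1 - \gamma^{-1})z - 1 + \sqrt{(1 + (1 - \gamma^{-1})z)^2 - 4z}}$$

and

$$K(z) = \frac{z}{2} \frac{d}{dz}\left((1 - \gamma^{-1})z + 1 - \sqrt{(1 + (1 - \gamma^{-1})z)^2 - 4z}\right).$$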

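Proof of Lemma 3.2. We need to define two auxiliary generating functions
to prove Lemma 3.2. Set

$$G(\gamma, z) := \sum_{n \ge 0} \sum_{x \in X_n} \gamma^{-o_x} z^n, \qquad \tilde G(\gamma, z) := \sum_{n \ge 0} \sum_{x \in X_n} \gamma^{-e_x} z^n.$$

Then, decomposing any trajectory $x \in X_n$ when $n > 0$ according to the first
return to the origin, one deduces the following relations:

$$\tilde G(\gamma, z) = 1 + z\, G(\gamma, z)\, \tilde G(\gamma, z), \qquad G(\gamma, z) = 1 + \gamma^{-1} z\, \tilde G(\gamma, z)\, G(\gamma, z),$$
$$F(\pi_1, \gamma, z) = \pi_1 + \pi_1 z\, G(\gamma, z)\, F(\pi_1, \gamma, z). \qquad (28)$$

Solving these equations yields (see [27]) that $F(\pi_1, \gamma, z) = \frac{\pi_1}{1 - \pi_1 z G(\gamma, z)}$
where

$$G(\gamma, z) = \frac{(1 - \gamma^{-1})z + 1 - \sqrt{(1 + (1 - \gamma^{-1})z)^2 - 4z}}{2z}.$$

For the evaluation of K, it is enough to observe that

$$K(z) = z \sum_{n \ge 0} (n + 1) \sum_{x \in X_n} \gamma^{-o_x} z^n = z \frac{d}{dz} \sum_{n \ge 0} \sum_{x \in X_n} \gamma^{-o_x} z^{n+1} = z \frac{d}{dz}\big(z G(\gamma, z)\big). \qquad (29)$$

This finally yields Lemma 3.2. □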
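
The closed form for G can be sanity-checked on its first Taylor coefficients. The sketch below is an illustration only, under the assumption that G weights each up step taken at an odd instant by $a = \gamma^{-1}$: it compares brute-force sums over Dyck paths with the coefficients of the root of $zG^2 - (1 + (1-a)z)G + 1 = 0$ satisfying $G(0) = 1$, which is the algebraic equation solved by the displayed square-root expression. The function names `odd_up_weight_sums` and `series_from_quadratic` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def odd_up_weight_sums(max_n, a):
    """g[n] = sum over Dyck paths x of length 2n of a**(odd up steps of x)."""
    sums = []
    for n in range(max_n + 1):
        total = Fraction(0)
        for steps in product((1, -1), repeat=2 * n):
            level, odd_ups, valid = 0, 0, True
            for instant, step in enumerate(steps, start=1):
                level += step
                if level < 0:
                    valid = False
                    break
                if step == 1 and instant % 2 == 1:
                    odd_ups += 1
            if valid and level == 0:
                total += a ** odd_ups
        sums.append(total)
    return sums


def series_from_quadratic(max_n, a):
    """Coefficients of the root G of z G^2 - (1 + (1 - a) z) G + 1 = 0 with
    G(0) = 1, obtained from G = 1 + z G^2 - (1 - a) z G."""
    g = [Fraction(1)]
    for n in range(max_n):
        convolution = sum(g[i] * g[n - i] for i in range(n + 1))
        g.append(convolution - (1 - a) * g[n])
    return g
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding; any rational value of a can be substituted.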



Thanks to Lemma 3.2, one shall then deduce the asymptotics of $a_n$ as n goes to
infinity from the generating function $H(z) := F(\pi_1, \gamma, z)K(z)$. Set $U = zG(\gamma, z)$.
It can be deduced from [27] (pp. 21-22) that U is holomorphic in the disk
$\{z,\ |z| < \sigma^2/u_+\}$, and one has that $z = \frac{U(1-U)}{1 - (1 - \gamma^{-1})U}$. Furthermore, $z = 0$ iff
$U = 0$. Assume first that $\pi_1 \le w_c$. One thus has that

$$a_n = \frac{1}{2i\pi} \oint_{C_0} \frac{1}{z^n}\, \frac{\frac{d}{dz}\big(zG(\gamma, z)\big)}{1/\pi_1 - zG(\gamma, z)}\, dz,$$

where the contour $C_0$ encircles 0, is oriented counterclockwise and lies in the
disk $\{z,\ |z| < \sigma^2/u_+\}$. By a straightforward change of variables, one gets that

$$a_n = \frac{1}{2i\pi} \oint_{C} \frac{1}{1/\pi_1 - u} \left(\frac{(1 - \gamma^{-1})u - 1}{u(u - 1)}\right)^{n} du,$$

where C is a symmetric contour encircling 0 and remaining on the left of $\sigma/\sqrt{u_+}$.
Note that this implies that the contour C cannot encircle $1/\pi_1$. It is then an
easy saddle point argument to check points (ii) and (iii): the critical point is
$u_c := 1/w_c$ and the saddle point contour is modified in a neighborhood of width
$1/\sqrt{n}$ of $u_c$ so that C remains to the left of $u_c$.

When $\pi_1 > w_c$, the contour C does not encircle $1/\pi_1$ and a straightforward
Laplace method leads to point (i). This finishes the proof of Lemma 3.1. □
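
The location of the saddle point can be verified with exact arithmetic on an example. The sketch below assumes the inverse map $z = u(1-u)/(1 - (1 - \gamma^{-1})u)$ of $u = zG(\gamma, z)$ and takes $\gamma = 4$, so that $\sqrt{\gamma} = 2$ is rational and $w_c = 3/2$; the helper names `z_of_u` and `dz_numerator` are hypothetical. It checks that dz/du vanishes at $u_c = 1/w_c$ and that $z(u_c) = 1/w_c^2$. It also evaluates $1/z$ at $u = 1/\pi_1$ for $\pi_1 = 3$; the value 27/8 agrees with $\pi_1(1 + \gamma^{-1}/(\pi_1 - 1))$, which is the form that $\tau(\pi_1)/\sigma^2$ is assumed to take here.

```python
from fractions import Fraction


def z_of_u(u, gamma):
    """Inverse map z = u (1 - u) / (1 - (1 - 1/gamma) u) of u = z G(gamma, z)."""
    b = 1 - Fraction(1) / gamma
    return u * (1 - u) / (1 - b * u)


def dz_numerator(u, gamma):
    """Numerator 1 - 2u + (1 - 1/gamma) u^2 of dz/du; its zero is the saddle."""
    b = 1 - Fraction(1) / gamma
    return 1 - 2 * u + b * u * u
```

The same two functions can be evaluated at other rational values of $\gamma$ whose square root is rational, for instance $\gamma = 1/4$ or $\gamma = 9$.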

We now turn to estimating the contribution of paths P with 1-edges seen only
twice and which give a fundamental glued path P' by erasing a positive number
of 1-edges (precisely $2(s - l)$ with our notations). We call $Z_2$ this contribution.
Note that the glued paths P' to be considered here are such that $m = l$ with
$m < s$.

Proposition 3.2. There exists a constant $C > 0$ such that $Z_2 \le C Z_1$.

The proof will make use of the following extension of Lemma 3.1. In the
next lemma, we write $\alpha_n = \alpha_n[\pi_1]$ (recall that $\alpha_n = \sigma^{2n} a_n$ with $a_n$ given by
(26)).

Lemma 3.3. Assume that $\pi_1 \le w_c$. Let $C > 0$ be some constant independent of
N. Then for all large N, and as long as $s_N = O(N^{2/3})$, there exists a constant
$C'$ depending on C only such that $\alpha_{s_N}[\pi_1 e^{C s_N/N}] = \alpha_{s_N}[\pi_1](C' + o(1))$.

We skip the proof of this lemma which can be obtained by the same saddle
point argument as in Lemma 3.1. We now turn to the proof of Proposition 3.2.

Proof of Proposition 3.2. Let P' be a fundamental path. It can first be
deduced from Lemma 2.1 (applied with $l = m$) that the number of paths P
which are preimages of P' and contribute to $Z_2$ is at most

$$s_1 \frac{(2 s s_N)^{s - m}}{(s - m)!} \qquad (30)$$

where $s_1$ is such that the first return to zero of P' occurs at time $2s_1$. We
already know from [20] and [26] that in typical paths P' no edge is read more
than twice. As it is assumed that P has no 1-edges seen 4 times or more, the
expectation of P is just $(\pi_1 \sigma^2/p)^{s - m}$ times that of P'. Using the inequality
$\mathbf{N}(s_N - (s - m), k, m - 1) \le \mathbf{N}(s_N, k + s - m, s - 1)$ and setting $s' = s - m \ge 1$
and $k' = k + s'$, one can check that there exists a constant $C > 0$ (whose value
may vary from line to line) such that

SN SN—Sl S— 1 SJV 

Z2 < ca'^^Y. E EE E ^iN(si-i,.i-fti) 

Si — 1 s — 1 rn — lk—lki<k'
(\ s—m
2SSN_\ 



N{SN -is-m)-si,k-ki,m~ 1) \ " ' ., tt^^ 7^"'" (31) 

(s — mj! 



sjv sn — Si sjv siv 



< ca^^-E E E E E ^iN(.i-i,.i-fco 

si=l s=l s' = l /c'=s' + l fei<fc' 



< Ca'^^Y. E E E ^iN(.i-l,.i-fci) 

si = l s=l k' = lki<k' 

N(sjv - Si, fc' - fci, s - 1){ exp (CssAr/iV) - 1} 7^'-^". (32) 

In (31), the factor $(2 s s_N/p)^{s - m}$ can be deduced from (30) and the fact that the
rescaling factor $p^{s_N}$ splits into $p^{s - m}\, p^{s_N - (s - m)}$. In the case where $\pi_1 > w_c$ and
$s_N = O(\sqrt{N})$, we then readily get the result. For the case where $\pi_1 \le w_c$ and
$s_N = O(N^{2/3})$, the conclusion follows from Lemma 3.3. □

Amongst the paths P associated to a fundamental glued path P', there
remains to consider those with some 1-edges seen four times or more. By definition
of the gluing procedure, these edges must be erased and appear at most twice
in P'. Thus, as for $Z_2$, one has that $m = l$ and $s > m$ (and there are at most
$(s - m)$ 1-edges seen 4 times or more in P). We call $Z_3$ the contribution of these
paths and we show that those contributing to $Z_3$ in a non-negligible way have
1-edges passed at most 4 times.

Proposition 3.3. There exists a constant $C_4 > 0$ which depends on the fourth
moments of the entries of X such that $Z_3 \le C_4 Z_2$.

Observe that $Z_3$ is non-negligible when $\pi_1 \ge w_c$, which partly explains the
added constraint on the fourth moments of the $X_{ij}$'s to get the announced
universality in cases $\pi_1 > w_c$ and $\pi_1 = w_c$.



Proof of Proposition 3.3. One already knows from [20] and [26] that in
typical paths P' no edge is read more than twice. Yet, in paths contributing
to $Z_3$, there exist some vertices that occur more than twice on each of the top
and bottom lines. Thus choosing $s - m$ moments of time in P' (to reconstruct
P) can result in a 1-edge which is read more than twice in P. Note that by
definition of clusters, such an edge can only be read inside one cluster in P. We
now estimate the expectation of the path P with respect to that of P'. We call
$Z_3'$ the contribution to $Z_3$ from paths P with 1-edges read 4 times at most and
$Z_3''$ denotes the remaining contribution to $Z_3$. We show that $Z_3'$ is of the order
of $Z_2$ while $Z_3'' = o(1) Z_2$. Our reasoning is mainly based on several properties
of the gluing procedure which has been defined in Section 2.3 above.

Consider a 1-edge $e = \binom{v}{1}$ which is read at least 4 times in P. Assume it is
read $2y_1 \ge 4$ times in P which, using (H2), implies that its expectation is at
most $(C y_1)^{y_1}$ for some constant $C > 0$ independent of N.

1st case: The edge e does not coincide with any of the edges of P'. In other
words, e is distinct from the edges starting the different clusters in P'. This
means that amongst the $s - m$ instants in P' where we have erased 1-edges (that
is the set T in Section 2.3), we have chosen $y_1$ times edges with vertex v on the
top line. These choices split the subpath $P_i^f$ (derived from the cluster $S_i$ by the
gluing procedure) into $y_1 + 1$ subpaths as follows:



(33) 



Denote now by Qj,j = 1, . . . , j/i — 1, the subpath starting with the edge 



V 

and ending with the edge ( " ) • Let also Qo (resp. Qr) be the subpath start- 

ing with (resp. ^) and ending with (resp. {^^)- Then each 

path P' which is obtained from P' by permuting any of the $Q_j$, $j = 1, \dots, y_1 - 1$,
leads by permuting the paths $P_i$ to the same path P. Thus, the number of
preimages of such a path P' has to be divided by a factor $(y_1 - 1)!$ since each
preimage is counted $(y_1 - 1)!$ times when considering all the possible paths P'
(recall the proof of Lemma 2.1). Taking into account the expectation of the
edge e in P then adds a factor $(C y_1)^{y_1}/(y_1 - 1)! \le C_1^{y_1}$.

Let us now count the number of ways to select $s - m$ moments of time in such a
way that we define $y_1$ times the same edge e. For this, assume that the instant t
where $Q_1$ begins in P' has been chosen. Then two situations may happen when
choosing the instant t' where it ends. To explain this, we need to introduce two
characteristics (already mentioned in Section 2.2) of the path P'. The first one
is $\nu_N = \nu_N(P')$, the maximal number of vertices that can be visited in P' at
marked instants from a given vertex different from the origin 1. The second one
is $T_N = T_N(P')$, the maximal type of a vertex in P'. We shall use the following
fact deduced from the very definitions of $\nu_N$ and $T_N$: given a vertex (different
from the origin 1) occurring in P', it appears at most $T_N + \nu_N$ (resp. $T_N$) times
as endpoint (resp. right endpoint) of up steps. It is then not hard to see that a
given vertex (distinct from the origin 1) appears at most $2(T_N + \nu_N)$ times along
the path P'. Thus, the number of ways to select $y_1$ times the same vertex v
when choosing (in P') the $s - m$ moments of time does not exceed:
] X [sN - s + m + yi) X [ 
s-m-yij \ yi-1 



for some constant C > 



SN \ f 2{Tn + vn] 



s — m J \sn — s 
2nd case: The edge $e$ coincides with one of the edges of $P'$. In this case, the
above reasoning on the permutation of the subpaths $Q_j$ still applies. One then
needs to distinguish two cases according to the value of $y_i$.

Case (a): $y_i = 2$. Then the edge $e$ is seen exactly four times in $P$. To
determine $e$ one has to select one of the $m$ edges starting a cluster. This
determines the vertex $v$. The occurrence of $v$ along the path $P'$ where we have
erased $e$ has then to be determined. In principle, there are at most $\nu_N + T_N$
possible choices for this occurrence. Nevertheless it is most probable that
$m(T_N + \nu_N) \gg s_N$ due to the fact that $m$ may be large. Thus, calling on the
characteristics $\nu_N$ and $T_N$ does not improve the estimate and it is sufficient to
notice, as in Lemma 2.1, that the number of ways to choose the $s - m$ moments
of time to determine the erased 1-edges is at most of order

$$\binom{s_N}{s - m},$$

that is exactly as for $Z_2$.

Case (b): $y_i > 2$. In this case, once $v$ is determined (with at most $s_N$ ways
to do so), there are at most $\binom{2(T_N + \nu_N)}{y_i - 2}$ possible ways to select $y_i - 2$ other
repetitions of $v$ in $P'$. Thus, the number of ways to select the $s - m$ moments
of time to determine the 1-edges in this case does not exceed

\s — mj \sn — s + m J 
We can now conclude that is negligible. Indeed, we deduce that 

^, X SN - s + m ) 
yi>3 

since for typical paths $P'$, one can show that $s_N - s + m \to \infty$ (using (31)
and the above) and that $(T_N + \nu_N)^2 \ll s_N - s + m$ (this point, recalled in
Section 2.2, follows from [20], Section 3.2). Similarly we can also show that the
contribution of paths $P$ with 1-edges seen 4 times in $P$ but that do not arise in
$P'$ (this corresponds to the situation of the previous 1st case with $y_i = 2$) is
negligible with respect to $Z_1$ and does not contribute to the expectation.
Last, it is not hard to see from Case (a) that Z3 is of the order of Zi. More 
precisely, for each edge e seen four times in P and twice in P', one has to 
multiply the expectation of P' by ■k-^\X\^\'^ j (cP'y) at most to get that of P. In 
the sequel we set $C_4 := \max_v E|X_{1v}|^4/\sigma^4 + 1$. As $s - m$ counts the number of
instants in $P$ where a 1-edge has been erased, one has that (compare with (31))

^3< Ca^^^g E EE E ^iN(si-l,si-fci) 

Si — 1 s—1 ?n — 1 A;— 1 fci<A;4-s — m 

N(s7V - (s - to) - Si, A: - fci, to - 1) \ \, -^l^- (35) 

(s - to)! 7/ 



^3°<2^.Ef^?^^7^)""-(l)^- (34) 
We then conclude (using Lemma 3.3 in the case where $\pi_1 \le w_c$) that there is
another constant $C'_4$ depending on the fourth moments of the $X_{ij}$'s such that
$Z_3 \le C'_4 Z_1$. This finishes the proof that typical paths $P$ may have edges seen
4 times but not more. Note that this happens when $\pi_1 > w_c$ only, and in this
case their associated path $P'$ has no (1-)edge seen more than twice. □

3.2 Edges shared by clusters 

In this section we investigate paths $P$ such that the clusters do share some edges
in such a way that the 1-edges in the glued path $P'$ are not necessarily read at
moments of time where the trajectory goes back to the level 0. Keeping the same
notations as before, the length of the path $P'$ is now $2(s_N - (s - l))$ and $l - m$
returns to the vertex 1 on the bottom line of $P'$ occur at some positive levels.
To consider such paths, we define a second gluing procedure and associate a
second path $P''$ to the initial path $P$. This gluing procedure is close to the
construction procedure already used in [25] and [26].

For short, we call $Q_i$ instead of $P'_i$ the subpaths in-between two returns to
the vertex 1 on the bottom line of $P'$ (recall Section 2.3). We let $i_1 < l - m - 1$
be the smallest index where the first return to the vertex 1 on the bottom line
occurs at some positive level. Then there exists an edge which is opened but not
closed in $Q_{i_1}$. We denote by $e$ the first of these edges. When reading
$P'$, let then $i_2$ be the lowest index such that the edge $e$ is closed (and odd) in
$Q_{i_2}$. Let then $\tilde e$ be the first edge in $Q_{i_1}$ occurring also in $Q_{i_2}$. Note that it may
happen that $\tilde e \neq e$: this arises in non typical paths only, as this implies that
$P'$ has edges seen at least 4 times. Let also $t_e$ and $t'_e$ be the instants of the
first occurrence of the edge $\tilde e$ in $Q_{i_1}$ and $Q_{i_2}$ respectively. We then define the
path $Q_{i_1} \vee Q_{i_2}$ obtained by the gluing of the two subpaths by erasing the first
occurrence of the common edge $\tilde e$ in each of the subpaths as follows. We first
read $Q_{i_1}$ until the left endpoint of the edge $\tilde e$ at time $t_e$. Then we switch to
$Q_{i_2}$ in the following way. If $t_e$ and $t'_e$ are of the same parity, we then read $Q_{i_2}$,
starting from $t'_e$, in the reverse direction to the origin and restart from the end
of $Q_{i_2}$ until we come back to the instant $t'_e + 1$. If $t_e$ and $t'_e$ are not of the same
parity, we read the edges of $Q_{i_2}$ in the usual direction starting from $t'_e + 1$ and
until we come back to the instant $t'_e$. We have then read all the edges of $Q_{i_2}$
except the edge $\tilde e$ occurring between $t'_e$ and $t'_e + 1$. We then read the end of
$Q_{i_1}$, starting from $t_e + 1$. Having done so, we obtain a path $Q_{i_1} \vee Q_{i_2}$ which
has the same final (and first) edge as $Q_{i_1}$ and the vertex 1 is marked once on
the bottom line. We then set $P'_1$ to be the path defined by

gi U . . . U Q^,-l U Q.,, V Q^, U Q,, . . . U Q^^. 

Here the hat means that the corresponding term does not appear. We then
replace $P'$ with $P'_1$ and restart the same procedure. We call $1 \le g \le l - m$
the number of gluings needed so that all the occurrences of 1-edges correspond
either to a marked instant or to a return to 0 of the associated trajectory. Note
that $g$ is defined by $l - m - g = \#\mathcal{I}$ where $\mathcal{I}$ is the set of the subpaths $Q_i$
which are sub-Dyck paths of origin 1 with all their edges even and which occur
at some positive levels in all the successive $P'_i$'s. Note that through the gluing
process, such subpaths are not modified but are moved to the level 0 in the
order they appear. We denote by $P''$ the path finally obtained after $g$ steps of
the gluing procedure. By definition of this gluing procedure, $P''$ is of length
$2(s_N - (s - l) - g)$ with exactly $m' = l - g$ returns to the level 0, its origin is the
vertex 1 and is marked $l - m'$ $(= g)$ times on the bottom line. In the following,
we denote by $k$ the number of odd marked instants in $P''$.

We shall now estimate the number of preimages P of such a path P" as well 
as their expectation. The first and main work here is to investigate the step 
from P" to P' . Once this is done, it will be quite straightforward to estimate the 
number of preimages P of such a path P' and their expectation by extending the 
analysis made in the previous subsection. To reconstruct $P'$ from $P''$, we have
to "recover" each of the $g$ glued subpaths $Q_{i_2}$. Thus, for each glued subpath
$Q_{i_1} \vee Q_{i_2}$, we have to find the instants where $Q_{i_2}$ begins and ends (that is the
two instants of switch from one path to the other); we also need to determine the
direction in which $Q_{i_2}$ is read as well as the origin of $Q_{i_2}$ in $P''$. Actually, the
origin of $Q_{i_2}$ is just given by the marked occurrence of the vertex 1 in $Q_{i_1} \vee Q_{i_2}$.
Then, we shall take into account the weighted contribution to the expectation of 
each erased edge e. More precisely, one already knows that in typical paths P" , 
each edge appears only twice. But when rebuilding P' from P", it may happen 
that some of the erased edges e appear more than twice in P'. As we will see, 
such paths will lead to a negligible contribution to the expectation which 
will ensure the universality. 

Consider paths $P$ having some non disjoint clusters so that some 1-edges
in the associated glued path $P'$ occur at positive levels. Denote by $Z_4$ the
contribution to the expectation (12) from such paths $P$ for which all the erased
edges $e$ between $P'$ and $P''$ appear exactly twice in $P$ (or $P'$).

Proposition 3.4. The main contribution to $Z_4$ comes from paths $P$ with all
edges seen twice except 1-edges which possibly occur 4 times. And there exists a
constant $C'_4 > 0$ depending on the fourth moments of the entries of $X$ such that
$Z_4 \le C'_4 Z_1$.

Proof of Proposition 3.4. Here we only consider paths $P''$ having all their
edges seen twice since, as previously said, these are the typical paths.

Let us first reconstruct P' from P". Due to the fact that the vertex 1 is 
marked g = l — m' times on the bottom line of P", the weighted number of such 
paths P" is at most of order 
times the weighted number of paths where the origin 1 is non marked and
with the same length, the same number of odd marked instants and the same 
number of returns to 0. Then, by definition of the gluing procedure, the first 
step beginning the subpath Qi^ in Qi-^ V is up which implies that the total 
number of ways to choose the instants of initial switch is at most of (^^n-(s-i)"\^ _ 
Moreover, the number of possible choices for the instants where one switches 
for the second time from the subpaths Qi^ to the Qi-^ is at most '•'^ 
(the factor g\ comes from the fact that the Qi.-^ may be interlaced). It remains 
to choose the direction and the order in which the subpaths Qi are read. We 
claim that this yields a factor 2^(™^"^^) x 5!- Indeed, one can notice that once 
the Qia's are identified, the remaining m' + g subpaths are known i.e. one knows 
the Qi/s and the paths belonging to the set V. Moreover, by construction of 
the gluing process, these latter subpaths appear in the same relative order as in 
P'. To reorder the Qi's, one needs first to choose the place where one actually 
encounters the (unordered) subpaths that are glued and moved by the gluing 
process i.e. the Qi^ (note that this also reorders those belonging to the set V). 
There are at most ways to do this. Last, the previous term g\ counts the 

number of ways to reorder the subpaths Qi2 ■ Regarding the respective weights 
of the paths P" and P' , one has to take into account the erased edges. As we 
assume that the erased edges are pairwise distinct and read exactly twice in P, 
the weight of P' is of order Ns that of P" . 

Now, one has to reconstruct $P$ from $P'$. In fact, this is really close to the
analysis made in the previous Section 3.1. Indeed, the upper bound on the
number of preimages $P$ of a path $P'$ obtained in Lemma 2.1 does not use the
assumption that clusters are disjoint or not. Thus we deduce from Lemma 2.1
that the number of preimages P of a path P' is at most si (j) (2sAr)'*^', if the 
first return to 0 of the trajectory of $P'$ holds at time $2s_1$.
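
The role played by the time $2s_1$ of the first return to 0 can be illustrated by brute force. The sketch below is not part of the proof: it enumerates all Dyck paths of a small length $2n$ and checks the classical decomposition according to the first return, namely that the number of paths whose first return occurs at time $2s_1$ is the number of Dyck paths of length $2(s_1 - 1)$ times the number of Dyck paths of length $2(n - s_1)$. The function names are illustrative choices, not notation from the text.

```python
from collections import Counter
from math import comb


def catalan(n):
    """Number of Dyck paths of length 2n."""
    return comb(2 * n, n) // (n + 1)


def dyck_paths(n):
    """Yield every Dyck path of length 2n as a tuple of +1 (up) / -1 (down) steps."""
    def extend(prefix, height, ups):
        if len(prefix) == 2 * n:
            yield tuple(prefix)
            return
        if ups < n:                      # an up step is still available
            yield from extend(prefix + [1], height + 1, ups + 1)
        if height > 0:                   # a down step must not go below level 0
            yield from extend(prefix + [-1], height - 1, ups)
    yield from extend([], 0, 0)


def first_return_counts(n):
    """Map s1 -> number of Dyck paths of length 2n whose first return to 0 is at time 2*s1."""
    counts = Counter()
    for path in dyck_paths(n):
        height = 0
        for t, step in enumerate(path, start=1):
            height += step
            if height == 0:
                counts[t // 2] += 1
                break
    return counts


if __name__ == "__main__":
    n = 6
    for s1, count in sorted(first_return_counts(n).items()):
        print(s1, count, catalan(s1 - 1) * catalan(n - s1))
```

Each printed row shows the enumerated count next to the product predicted by the decomposition.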

We are now in position to estimate the contribution $Z_4$. Observe first that
$N(s_N - s_1 - (s - l) - g, k - k_1, m' - 1) \le N(s_N - s_1, k + (s - l) + g - k_1, s - 1)$
(recall that $m' + s - l + g = s$). Hence, letting $k' = k + (s - l) + g$ and using
the fact that $m' = l - g$, by computations similar to those made for the $Z_i$,
$i = 1, 2, 3$ in the preceding section, one has that

< Ca^"" J2 J2 X siN(si-l,.si-fci)N(sAr-si,fc'-A:i,s-l) 

Si = l s=l g,fc',fci 

where we used the fact that $l \le s$. $C$ and $C_4$ are positive constants independent
of $N$ but $C_4$ depends on the fourth moments of the entries of $X$; $C'_4 > 0$ is
another constant depending on $C_4$. In the case where $s_N = O(\sqrt{N})$, one can
readily see that



Si — 1 s—1 k' ,ki 

— C'^Zi. 

In the scale $s_N \sim N^{2/3}$, the above estimate needs to be refined (since $s_N^2 \gg N$).
In fact, we can improve the bound on the number of choices of the starting
instants of the subpaths $Q_{i_2}$. To see this, call $t'_1 < t'_2 < \cdots < t'_g$ the instants
corresponding to the end of the $Q_{i_2}$'s. Denote by $t_1 < t_2 < \cdots < t_g$ the
instants beginning the reading of the $Q_{i_2}$'s. By definition of the gluing process,
each edge started at the instant $t_i$ is an up edge. Furthermore, if $x''$ denotes
the Dyck path associated to the path $P''$, one has for any $i = 1, \dots, g$ that
$x''(t) \ge x''(t_i) > 0$, $\forall t \in [t_i, t'_i]$. Thus, the interval $[t_i, t'_i]$ is included in one
sub-Dyck path of $x''$. We claim that the total number of ways to choose $t_i$ and $t'_i$
does not exceed $C s_N^{3/2}$ for some constant $C > 0$. Indeed, choosing $t'_i$ determines
the sub-Dyck path of $x''$ containing $[t_i, t'_i]$. We call $X_j$ this sub-Dyck path, $2L_j$
its length and $k_j$ the number of its odd up steps. Our estimate is obvious in
the case where $L_j \le s_N^{1/2}$. If $L_j \ge s_N^{1/2}$, we call $N(t'_i)$ the number of ways to
determine $t_i$. Let $E_{L_j,k_j}$ denote the expectation with respect to the uniform
distribution on the set $X_{L_j,k_j}$ of Dyck paths of length $2L_j$ with $k_j$ odd up steps.
Then there exists some constant $C > 0$ independent of $N$, $k_j$ and $L_j$ such that
(for typical $k_j$'s)

EL„fc, < C. (38) 
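
To make the set of Dyck paths of length $2L$ with $k$ odd up steps concrete, the following sketch enumerates the Dyck paths of a small length and tabulates them by their number of odd up steps. It assumes that an "odd up step" is an up step performed at an odd instant $t = 1, 3, 5, \dots$, which is how the term is read here. For the small lengths tested, the counts agree with the Narayana numbers $\frac{1}{L}\binom{L}{k}\binom{L}{k-1}$. Function names are hypothetical.

```python
from collections import Counter
from math import comb


def narayana(n, k):
    """Narayana number (1/n) * C(n, k) * C(n, k - 1)."""
    return comb(n, k) * comb(n, k - 1) // n


def odd_up_step_counts(n):
    """Map k -> number of Dyck paths of length 2n with exactly k up steps at odd instants."""
    counts = Counter()

    def walk(t, height, ups, odd_ups):
        if t == 2 * n:
            counts[odd_ups] += 1
            return
        instant = t + 1                  # instants are numbered 1, 2, ..., 2n
        if ups < n:
            walk(t + 1, height + 1, ups + 1, odd_ups + (instant % 2))
        if height > 0:
            walk(t + 1, height - 1, ups, odd_ups)

    walk(0, 0, 0, 0)
    return counts


if __name__ == "__main__":
    for n in range(1, 7):
        counts = odd_up_step_counts(n)
        print(n, [counts[k] for k in range(1, n + 1)],
              [narayana(n, k) for k in range(1, n + 1)])
```

The uniform distribution used in the expectation above is simply the normalized version of such a table restricted to one value of $k$.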

The above bound essentially follows from the estimation obtained in Section 2.5
in [20]. Indeed, setting $T_{0,n,k} := \# X_{n,k}$ for any $n, k$, one has that



n,k' 



The term $4\inf\{n, (L_j - n)\}$ counts the number of ways to choose $t_i$ once given
$t'_i$ and $2n$ which is the length of the sub-Dyck paths between $t_i$ and the first
return to $x(t_i)$ followed by a down step. Given $g > 0$ and a Dyck path $X$ of
length $2L$, we set

$$K_g(X) := \sum_{1 \le t'_1 < t'_2 < \cdots < t'_g \le 2L} \ \prod_{i=1}^{g} N(t'_i),$$

where the sum bears on $t'_i$ such that $X(t'_i) > 0$, $\forall i = 1, \dots, g$. Similarly and
using the Appendix in  one can show that there exists a constant $C > 0$
independent of $N$, $k_j$ and $L_j$ such that (for typical $k_j$'s)

$$E_{L_j,k_j}\left[K_g(X_j)\right] \le \left(C s_N^{3/2}\right)^g. \qquad (39)$$
In the sequel, we call $X_{s_N - s_1, k' - k_1, s - 1}$ the set of Dyck paths $x$ in $X_{s_N - s_1, k' - k_1}$
with $s - 1$ returns to 0. Let $E_{k', s, s_N}$ denote the expectation with respect to the
uniform distribution on Dyck paths of length $2s_N$ with $s$ returns to the level 0
and $k'$ odd marked instants. Let $x_1$ be the Dyck path defining the first return
to 0 of $x''$. The above estimate of $Z_4$ may then be replaced by



Sn SN — si s I 



si=l s=l i=l 3=0 fc',fci xiexsi-l.si-fei £6Xs„-si.fc'-fci,s-l ^ 



-^^"EEEEE E 



E 



K^'jxiUx) 
N 



We then deduce that there is a constant $C'_4 > 0$ (depending on $C_4$) such that
$Z_4 \le C'_4 Z_1$. This ends the proof of Proposition 3.4. □

Remark 3.1. Note that we have also shown that paths $P'$ having at least one
return to the vertex 1 on the bottom line at some positive level lead to a negligible
contribution in any scale $s_N \ll N^{2/3}$. This follows from (39).



To complete the analysis, there remains to consider paths P leading to a glued 
path P' having some 1-edges which occur at positive levels and for which some 
of the erased edges e in-between P' and P" appear 4 times or more in P (or in 
$P'$). We call $Z_5$ their contribution to the expectation (12).

Proposition 3.5. One has that $Z_5 = o(1) Z_1$.



Proof of Proposition 3.5. Consider the set of paths $P$ such that the edges
which are erased in-between $P'$ and $P''$ and which arise at least four times in $P$
do not appear in $P''$. We call $Z'_5$ their contribution to $Z_5$ and set $Z''_5 := Z_5 - Z'_5$.

We first show that $Z'_5 \ll Z_1$. Let $e$ be one of the erased edges
in-between $P'$ and $P''$ which appears 4 times or more in $P'$ (or in $P$). Denote
by $n_e > 1$ the number of times where $e$ is an erased edge in the gluing process.
We here assume that $P$ contributes to $Z'_5$ so that $e$ appears exactly $2n_e$ times
in $P$. Thus there are $n_e$ pairs of paths $(Q_{i_1}, Q_{i_2})$ in $P'$ such that the derived
glued subpath $Q_{i_1} \vee Q_{i_2}$ is associated to the same edge $e$.

Let us first prove that this decreases the number of preimages of the path P" 
by a factor of 



Sn 



(41) 



where the quantities $\nu_N = \nu_N(P'')$ and $T_N = T_N(P'')$ have already been defined
and used in the proof of Proposition 3.3. For this, consider the first (resp.
second) of the previously considered pairs $(Q_{i_1}, Q_{i_2})$ and denote by $t_{e,1}$ (resp.
$t_{e,2}$) the instant where $Q_{i_2}$ begins and by $t'_{e,1}$ (resp. $t'_{e,2}$) the instant where $Q_{i_2}$
ends in $Q_{i_1} \vee Q_{i_2}$. Suppose now that the instants $t_{e,1}$ and $t'_{e,1}$ have been chosen
and that the vertex $\alpha$ (resp. $\beta$) occurs at time $t_{e,1}$ (resp. $t'_{e,1}$) (the other case
can be handled similarly). We claim that the number of choices for the instant
$t'_{e,2}$ is at most $2(\nu_N + T_N)$ instead of $s_N$. We recall that the quantities $\nu_N$
and $T_N$ are such that, given an arbitrary vertex $v \neq 1$ in $P''$, there are at most
$\nu_N + T_N$ up steps having $v$ for endpoint. The announced bound readily follows
since the edge started at $t'_{e,2}$ has $\alpha$ or $\beta$ (already determined by the choices of
$t_{e,1}$ and $t'_{e,1}$) as left endpoint. Obviously, the reasoning also applies to the $n_e - 2$
remaining glued subpaths $Q_{i_1} \vee Q_{i_2}$ having $e$ as associated erased edge.
We now consider the weight of the path P with respect to that of P" . The 
weight of the erased edge e must multiply that of P" : we use the fact that 



Combining all the preceding, we deduce that is of order 

n>l ^ ^ 



is thus negligible with respect to $Z_4$ (and so $Z_1$) since $\nu_N + T_N \ll \sqrt{s_N}$ in
typical paths $P''$ (cf. [20], Section 3.2).

We now estimate Z5. Consider a path P which contributes to Z5 and let e 
be an erased edge in-between $P'$ and $P''$ which appears 4 times or more in $P'$.
We also denote by $n_e \ge 1$ the number of times where $e$ is an erased edge in
the gluing process. As in typical paths $P''$ each edge is passed twice, $e$ appears
exactly $2(n_e + 1)$ times in $P$. The previous reasoning works when $n_e > 1$. We
then see that the sole case which remains to be considered is $n_e = 1$, that is when
$e$ occurs 4 times and not more in $P$. In this case, we determine the instants $t_{e,1}$
and $t'_{e,1}$ as in the estimate of $Z_4$. This determines the edge $e$. Now, the
knowledge of $e$ decreases the possible choices of the marked occurrence of $e$ in
$P''$. More precisely, one pays a cost of order $(\nu_N + T_N)^2/s_N$ so that a marked
occurrence of $\beta$ (for instance) arises in $P''$ after an occurrence of $\alpha$ (see also
[23], p. 13). Using the above it is not hard to deduce that this contribution is
negligible with respect to $Z_1$. This finishes the proof of Proposition 3.5. □

4 Higher moments 

Let $K$ be a fixed integer. Let also $c_i$, $i = 1, \dots, K$, be some positive real numbers.
In this section, we compute moments of the type $E\big[\prod_{i=1}^{K} \mathrm{Tr}\, V_N^{s_N^{(i)}}\big]$, where $(s_N^{(i)})$
are some sequences of integers such that $\lim_{N\to\infty} s_N^{(i)}/\sqrt{N} = c_i$ if $\pi_1 > w_c$ and
$\lim_{N\to\infty} s_N^{(i)}/N^{2/3} = c_i$ if $\pi_1 \le w_c$. Then we prove the following result. Set

y{G) y{G) 
y^) ^ _N_ < y(G) ^ .J ^ 
Proposition 4.1. Under the assumptions of Theorems 1.5 and 1.6 there exists
a constant $C = C(K) > 0$ such that $E\big[\prod_{i=1}^{K} \mathrm{Tr}\, V_N^{s_N^{(i)}}\big] \le C$ and

$$E\Big(\prod_{i=1}^{K} \mathrm{Tr}\, V_N^{s_N^{(i)}}\Big) = (1 + o(1))\, E\Big(\prod_{i=1}^{K} \mathrm{Tr}\, (V_N^{G})^{s_N^{(i)}}\Big).$$

In the case where $\pi_1 > w_c$, the constant $C$ also depends on $\max_j E|X_{1j}|^4$.

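Proof of Proposition 4.1. We consider the variance $E\big(\mathrm{Tr}\, V_N^{s_N} - E\,\mathrm{Tr}\, V_N^{s_N}\big)^2$
only. Indeed the computations needed to consider higher moments follow from
the same arguments combined with those developed in Section 5 of [25].
Proposition 4.1 can then be restated as follows.

Lemma 4.1. There exists $C > 0$ such that $E\big(\mathrm{Tr}\, V_N^{s_N} - E\,\mathrm{Tr}\, V_N^{s_N}\big)^2 \le C$ and
one has $E\big(\mathrm{Tr}\, V_N^{s_N} - E\,\mathrm{Tr}\, V_N^{s_N}\big)^2 = (1 + o(1))\, E\big(\mathrm{Tr}\, (V_N^{G})^{s_N} - E\,\mathrm{Tr}\, (V_N^{G})^{s_N}\big)^2$.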

Proof of Lemma 4.1. Let us define $Y := \Sigma^{1/2} X$. Then,

" ' -E 





Here, given an edge $e = (i,j) \in P_{(1)}$ (this is similar for $P_{(2)}$), $Y_{ij}$ stands for $Y_{ij}$
if $e$ occurs at an odd instant of $P_{(1)}$ and for $Y_{ji}$ if it occurs at an even instant.
The starred sum bears on paths $P_{(1)}, P_{(2)}$ of length $2s_N$ sharing at least one
common edge $(i,j)$, $i \in [1, \dots, N]$, $j \in [1, \dots, p]$. This follows from the fact
that the $Y_{ij}$'s are independent centered random variables. We say that such
paths are correlated paths. The contribution to the variance from correlated
paths without 1-edges can be deduced from [20]. We thus focus on the pairs of
correlated paths with 1-edges and assume without loss of generality that $P_{(1)}$
has at least one 1-edge.
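
The notion of correlated paths can be made concrete with a small sketch. The encoding below is an assumption made for illustration only: a closed path is stored as its list of vertices, vertices at even positions lie on the bottom line and those at odd positions on the top line, an edge is recorded as the pair (bottom vertex, top vertex), and a 1-edge is an edge whose bottom vertex is 1. The helper names are hypothetical.

```python
from collections import Counter


def edge_counts(path):
    """Count how many times each edge (bottom vertex, top vertex) is read along a closed path.

    `path` lists the vertices v_0, v_1, ..., v_{2s} with v_{2s} == v_0.
    """
    counts = Counter()
    for t in range(len(path) - 1):
        a, b = path[t], path[t + 1]
        # Step t + 1 is odd when t is even: it goes from the bottom line to the top line.
        edge = (a, b) if t % 2 == 0 else (b, a)
        counts[edge] += 1
    return counts


def are_correlated(path1, path2):
    """Two paths are correlated when they share at least one edge."""
    return bool(edge_counts(path1).keys() & edge_counts(path2).keys())


def has_one_edge(path):
    """True when the path reads at least one edge whose bottom vertex is 1."""
    return any(bottom == 1 for bottom, _top in edge_counts(path))


if __name__ == "__main__":
    p1 = [1, 1, 2, 1, 1]   # edges (1,1) and (2,1), each read twice
    p2 = [3, 2, 3]         # edge (3,2) read twice
    p3 = [2, 1, 2]         # edge (2,1) read twice
    print(are_correlated(p1, p2), are_correlated(p1, p3), has_one_edge(p1))
```

Since the entries are independent and centered, only pairs for which `are_correlated` holds can contribute to the variance, which is the restriction expressed by the starred sum.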



We first consider the case where both $P_{(1)}$ and $P_{(2)}$ have 1-edges. We denote
by $T_1$ (resp. $T_2$) the number of pairs of 1-edges in $P_{(1)}$ (resp. $P_{(2)}$). We also set
$s = T_1 + T_2$. We build from $P_{(1)}$ (resp. $P_{(2)}$) $T_1$ (resp. $T_2$) subpaths $(P_i)_{1 \le i \le T_1}$
(resp. $(P_i)_{T_1 + 1 \le i \le s}$) starting and ending with a 1-edge as in Section 2.3. In the
following, we use the denomination "1-subpath" or simply "subpath" of $P_{(1)}$ or
$P_{(2)}$ to refer to some subpath $P_i$. By definition, the origin of $P_{(1)}$ (resp. $P_{(2)}$)
occurs at some even instant in the subpath $P_1$ (resp. $P_{T_1 + 1}$). We concatenate
the subpaths $(P_i)_{1 \le i \le s}$ in the order they appear which leads to an even path $P$
of length $4s_N$.
Case 1: The subpaths $P_i$ in $P_{(1)}$ and those of $P_{(2)}$ share 1-edges only. Then,
as in Section 2.3, we define the $l \le s$ clusters of $P$ and we apply the first gluing
procedure yielding a path $P'$ of length $2(2s_N - (s - l))$. We denote by $x'$ the
trajectory of $P'$ and by $m$ the number of returns to 0 of $x'$. As $P_{(1)}$ and $P_{(2)}$ are
correlated, one has that $l < s$. Here, we will also assume that $P'$ is fundamental,
that is $m = l$. Otherwise, one has to perform the second gluing procedure
on $P'$, yielding a new path $P''$; but as we assume that the paths $P_{(1)}$ and $P_{(2)}$
share 1-edges only, all the arguments we will give to determine $P_{(2)}$ from $P'$ are
exactly the same when dealing with $P''$ (see below). Thus, focusing on $P'$, $s - m$
counts the number of pairs of 1-edges that have been erased through the first
gluing process. For the sequel, it is convenient to denote by $2L_j$, $j = 1, \dots, m$,
the length of the successive $m$ sub-Dyck paths of $x'$ ($\sum_{j=1}^{m} 2L_j = 2(2s_N - (s - l))$).
To reconstruct $P_{(1)}$ and $P_{(2)}$ from $P'$, one has to determine the $s - m$ instants
of time where a 1-edge has been erased and reorder the subpaths thus defined.
One also has to determine the origins of $P_{(1)}$ and $P_{(2)}$. By construction, the
origin of $P_{(1)}$ occurs at some even instant in the first 1-subpath in $P'$. We call
$t_e$ the first moment of time where a 1-subpath of $P_{(2)}$ is glued to a 1-subpath
of $P_{(1)}$. We call $Q$ the latter subpath of $P_{(2)}$. One can note that at time $t_e$, a
1-edge which we call $e$ is erased. Last, we let $t_f$ be the instant where $Q$ stops
in $P'$. Two cases must be considered now since $t_f$ can be an instant where a
1-edge is erased or where $x'$ returns to 0.

Assume first that tf is an instant where a 1-edge is erased. Assume that 
tf and all but te of the s — m — 2 other moments of time where a 1-edge is 
erased have been selected in P'. Assume also that the corresponding (s — 1) 
1-subpaths have been reordered. There are possible choices for the 

$s - m - 1$ instants of the erased edges and $(s - 1)!/m!$ ways to reorder the
1-subpaths thus defined. Indeed, the $m$ subpaths beginning the sub-Dyck paths
of $x'$ arise in the same relative order in $P'$ and in the concatenation $P$ (cf. the
proof of Lemma 2.1). We call $P$ the path obtained after rearranging these $s - 1$
subpaths. We now choose along $P$ the instant $t_o$ defining the origin of $P_{(2)}$: this
determines all the subpaths of $P_{(2)}$ except $Q$. There are at most $2s_N$ choices
for $t_o$. A crucial fact now is that the knowledge of $t_o$ combined with that of
$t_f$ determines the instant $t_e$ (in $P$) since as $P_{(2)}$ is of length $2s_N$, the length of
$Q$ is then known. To obtain the full path $P_{(2)}$ and the final concatenation $P$,
it remains to insert $Q$ in $P$; there are at most $2s$ ways to do this (the factor 2
comes from the choice of the direction of reading $Q$ in $P$). Set $\tilde s_N = 2s_N - 1$ and
$C_4 := 1 + \max_v E|X_{1v}|^4/\sigma^4$. Combining the whole, we get (for details, see the
computations of $Z_2$ and $Z_3$ made in Section 3.1) that the contribution $Z^{(1)}$ to
the variance from correlated paths $(P_{(1)}, P_{(2)})$ such that $t_f$ is an instant where
a 1-edge is erased is at most (for some constant $C > 0$)

4:i< Ca^'^E E EE E ^i^N(s,-l,s,-M 

si = l s=l m=l k=l ki<k+s-m 
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f 2C4SS,. 

N(5jv - (s - m) - Si , fc - fci , m - 1) ^ ^ nl (42) 

:= l^^a^- X A,^ (43) 

where we let $s' := s - m - 1$. The factor $C_4/N$ comes from the weight of the
erased 1-edge $e$ (it can indeed be shown that $e$ can only occur at most four
times in typical paths $P$). In the case where $s_N = O(\sqrt{N})$ and $\pi_1 > w_c$, we
readily deduce that (|42p / (rim 1)^^" is bounded (universality is discussed at the 
end of this section). In the case where $\pi_1 \le w_c$ and $s_N = O(N^{2/3})$, it is a small
computation, using the same arguments as in Lemma 3.3, to check that

As^ - 0(1)-^ if ^1 < and = 0(1)^;" if tti = u.^. (44) 

Assume now that the instant $t_f$ is such that $x'(t_f) = 0$. We then fix $t_f$ by
choosing one such instant: this fixes some $1 \le j \le m$. Assume that all but $t_e$ of
the $s - m - 1$ other moments of time where a 1-edge is erased have been selected
in $P'$. The knowledge of $t_f$ and of the $s - m - 1$ selected instants determines
the subpath $Q_0$ in $P'$ which still has to be split into a subpath of $P_{(1)}$ and the
first subpath, which we call $Q$, of $P_{(2)}$ that is glued to one subpath of $P_{(1)}$. By
construction, $Q_0$ is included in the sub-Dyck path of $x'$ ending at time $t_f$ so
that the length of $Q_0$ is not greater than $2L_j$. As before we reorder the $(s - 1)$
1-subpaths thus defined to get the path $P$. Now given $P$, we claim that there are
at most $8s_N$ different ways to choose $t_f$ and the instant $t_o$ defining (along $P$) the
origin of $P_{(2)}$. Indeed, in the final concatenation (that is in $P$), the origin of $P_{(2)}$
is encountered along the first subpath of $P_{(2)}$. So in $P$, $t_o$ is either in $Q$ or in a
1-subpath which begins in the interval of time $[2s_N, 2s_N + 2L_j]$. Denoting by $2l'$
the length of the 1-path beginning in $[2s_N, 2s_N + 2L_j]$ but which does not finish
in this interval (if it exists), there exists some $L'_j$ such that $2l' \le 2L'_j$. Hence
the number of possible choices for $t_f$ and $t_o$ is at most $\sum_{j=1}^{m} (2L_j + 2L'_j) \le 8s_N$,
which is what we wanted. We then readily conclude that the contribution
of such correlated paths $(P_{(1)}, P_{(2)})$ behaves as the preceding one, since it is at
most 4 times the r.h.s. of (42).



Case 2: The paths $P_{(1)}$ and $P_{(2)}$ share edges which are not 1-edges. We
denote by $Z^{(2)}$ the contribution of such correlated paths $(P_{(1)}, P_{(2)})$ to the
variance. Dealing with such a pair, we still apply the first gluing procedure on the
concatenation $P$. If the path $P'$ obtained in this way is such that all the 1-edges
arise when the trajectory of $P'$ returns to the level 0, we can finish the proof as
before.

Otherwise, we apply the second gluing procedure defined in Section 3.2, getting
a new path $P''$ where each occurrence of the vertex 1 on the bottom line
corresponds to a marked instant or an instant where its trajectory $x''$ returns to
0. We denote by $m$ the number of returns to 0 of $x''$ and by $2L_1, \dots, 2L_m$ the
length of the successive $m$ sub-Dyck paths of $x''$. Given $P''$, we shall now
reconstruct the paths $P_{(1)}$ and $P_{(2)}$. As before, we call $Q$ the first subpath of $P_{(2)}$
that is glued to one of $P_{(1)}$. We consider here the case where $Q$ is glued using
the second procedure which means that its gluing is associated to a marked
occurrence of the vertex 1 in $P''$ (since the other case can be treated using the
arguments developed in the previous case). We also denote by $t_e$ (resp. $t_f$) the
instant of time where $Q$ begins (resp. ends) in $P''$. We assume that all the
instants needed to define the gluing but that of $Q$ are chosen. We also assume
that all the marked occurrences of 1, except that associated to the gluing of $Q$,
are known. All these instants define $(s - 1)$ 1-subpaths which we reorder as
before defining a new path called $P$. Now if one also knows the instant $t_o$ defining
the origin of $P_{(2)}$ in $P$ and if $t_e$ is fixed, then the length of $Q$ is determined
and there is no choice for $t_f$. In the sequel we set $l_Q = t_f - t_e$. We consider
the case where the cluster containing the subpath $Q$ is well separated from the
others (not interlaced). The other case follows from the same considerations.
Then during the time interval $[t_e, t_e + l_Q]$, the trajectory $x''$ of $P$ does not go
below the level $x''(t_e)$ and there is also a marked occurrence of 1 on the bottom
line. Furthermore, by the definition of the second gluing procedure, $[t_e, t_e + l_Q]$
is included in a sub-Dyck path of $P$. Assume that this is the $j$th sub-Dyck
path, which is thus of length $2L_j$. Let also $k_j$ be the number of odd up steps in
this sub-Dyck path. Denote by $N_{t_e}$ the total number of possible choices for the
instants $t_e$ and that of the marked occurrence of 1 associated to the gluing of
$Q$. Let $E_{L_j,k_j}$ denote the expectation with respect to the uniform distribution
on the set $X_{L_j,k_j}$ of Dyck paths of length $2L_j$ with $k_j$ odd up steps. Then there
exists a constant $C > 0$ independent of $N$, $k_j$ and $L_j$ such that (for typical $k_j$'s)



$$E_{L_j,k_j}\left[N_{t_e}/s_N^{3/2}\right] \le C. \qquad (45)$$



The above bound essentially follows from arguments close to those used in
 and the estimation obtained in Section 2.5 in [20]. More precisely,
setting $T_{0,n,k} := \# X_{n,k}$ for any $n, k$, it is easy to show that



n,k' 0,L,-,fc,- 



where $n$ (resp. $L_j - n$) counts the number of possible choices of the instant of the
marked occurrence of 1 (resp. of the instant $t_e$) if the sub-Dyck path between $t_e$
and the first return to $x(t_e)$ followed by a down step is of length $2n$. The above
estimate clearly holds if $L_j \le s_N^{1/2}$ and if $L_j \ge s_N^{1/2}$, one can copy the arguments
of Section 2.5 in [20]. We are now in position to estimate the contribution $Z^{(2)}$
of such correlated paths. To this aim, we denote by $Z'_4$ the contribution $Z_4$ of
Section 3.2 corresponding to even paths of length $4s_N$ instead of $2s_N$. Apart
from $N_{t_e}$, one needs to multiply the contribution of $P''$ by a factor of the order
$16\sigma^2 s s_N/N^2$. Indeed, the number of ways to determine the sub-Dyck path of $P$
where $[t_e, t_e + l_Q]$ is included and the origin of $P_{(2)}$ in $P''$ may be controlled as
before by a factor $\sum_{j=1}^{m} (2L_j + 2L'_j) \le 8s_N$. Besides, there are at most $2s$ ways
to insert $Q$ in $P$ and choose its orientation. Last, due to the edge $e$ erased at
time $t_e$ in-between $P$ and $P''$ (it can be shown that $e$ does not occur in typical
paths $P''$ and occurs twice in typical paths $P$) and due to the marked occurrence
of 1 associated to $t_e$ and $t_f$, the weight of the path has to be multiplied by a
factor $\sigma^2/p \times 1/N$. Hence, inserting the factor $\sigma^2 s s_N N_{t_e}/(pN)$ in the
computations of $Z'_4$ and using
(giD and (gH), leads to 

2 3/2 

A2) ^ <^<^ gjV ^ 7(1) 



r^y^, ^ "iV y y\ 



for some positive constant $C$. One can then check that $Z^{(2)} = O(1) Z'_4$.

To complete the analysis, we now investigate the case where $P_{(2)}$ has no
1-edge. In this case, we first apply the first gluing procedure to $P_{(1)}$, which leads
to a path called $P'_{(1)}$. Then we use the second gluing procedure to "insert" $P_{(2)}$:
we consider the first edge along $P'_{(1)}$ which is also encountered along $P_{(2)}$ and
use it to glue $P_{(2)}$ by the construction procedure used in Section 3.2. Last we
use the second gluing procedure (if needed) to obtain a final path where all the
1-edges arise at level 0 of the associated trajectory or correspond to a marked
occurrence of 1. The procedure we use in this case can be compared to that of
Case 2, provided the path $P_{(2)}$ is "assimilated" to a 1-subpath. The analysis
performed in Case 2 can be copied up to minor modifications to show that the
contribution to the variance of such correlated paths is of the order of $Z^{(2)}$.

Combining all the preceding implies that the total contribution to the variance
$\mathrm{Var}(\mathrm{Tr}\, V_N^{s_N})$ from correlated paths $(P_{(1)}, P_{(2)})$ is bounded. (44) also implies
that in the case where $\pi_1 < w_c$, the contribution of paths with 1-edges is
negligible in the large-$N$ limit.

To conclude to universality of the variance, we can use the fact that only the
pairs of correlated paths with 1-edges seen at most 4 times and other edges
passed exactly twice contribute in a non negligible way to (42). Thus
universality of the variance (and higher moments) can be deduced from universality of
the expectation of traces of $V_N^{s_N}$. This finishes the proof of Proposition 4.1. □



5 More than one eigenvalue greater than 1 

In this section we consider more general spiked sample covariance matrices $(V_N)$
given by (3) with a spiked covariance matrix $\Sigma = \mathrm{diag}(\pi_1, \pi_2, \ldots, \pi_r, 1, \ldots, 1)$,
where $r \ge 2$ is some fixed integer independent of $p$ and $N$ and $\pi_1 \ge \pi_2 \ge \cdots \ge \pi_r > 1$
are given real numbers, also independent of $p$ and $N$.
We shall explain the main modifications to be made in the previous analysis in
order to prove Theorems 1.5 and 1.6 in this more complex case. As in the case
where $r = 1$ (recall Section 1.2, which includes the case where $r > 1$), one has to
prove boundedness and universality of moments (of any fixed order) of traces of
high powers of $V_N$. We here restrict ourselves to the study of the expectation.
Universality of moments of higher order of traces of $V_N$ then follows from the
same arguments as in the case where $r = 1$ (see Section 4).

As before, $(s_N)$ denotes a sequence of integers that may grow to infinity. In
order to examine the contribution from paths $P$ to the expectation $\mathbb{E}(\mathrm{Tr}\, V_N^{s_N})$,
one has to consider the number of times each of the vertices $1, \ldots, r$ occurs on
the bottom line of $P$. To fix ideas, we consider the case where
$r = 2$. The general case then follows from a straightforward extension of the
arguments used when $r = 2$.

Let then a path $P$ of length $2s_N$ contributing to $\mathbb{E}(\mathrm{Tr}\, V_N^{s_N})$ be given. We
assume that $P$ has $T_1$ (resp. $T_2$) pairs of 1-edges (resp. 2-edges) with $T_1 + T_2 \ge 1$.
We set $s := T_1 + T_2$. To deal with such a path, we define a glued path $P'$. The
gluing procedure (leading to $P'$) defined in Section 2.3 when $P$ has only 1-edges
(or only 2-edges) is modified in the following way when $T_1 T_2 > 0$. We first
identify the instants $t_1 < t_2 < \cdots < t_s$ where the first edge of pairs of 1-edges
or 2-edges occur in the path. We call $e_i^{l}$ (resp. $e_i^{r}$) the left (resp. right)
edge of the $i$-th of these $s$ pairs of edges. Then, for $i \ge 2$, we define the subpath $P_i$ as the
subpath starting with $e_{i-1}^{r}$ and ending at $e_i^{l}$. As before, $P_1$ is the path starting
at $e_s^{r}$ and ending at $e_1^{l}$ (we concatenate the end and beginning of $P$). Two
subpaths $P_i$ and $P_{i'}$ are now said to belong to the same "connected component"
if they share a 1-edge or a 2-edge. We denote by $l$ ($l \le s$) the number of such
connected components. Consider the first connected component and denote by
$l_1$ its cardinality. We claim (since each of the 1- or 2-edges occurs an even number
of times, see Subsection 2.3.2) that there exists a way to glue the $l_1$ subpaths in
order to form a path satisfying the following conditions: it starts and ends with
the same 1-edge or 2-edge and has no other 1-edge or 2-edge. We do the same
for the other components so as to define $l$ paths (one for each
connected component) which have pairwise distinct first edges and appear in the
same relative order as in the initial path $P$. We denote by $Q_j$, $j = 1, \ldots, l$, the
successive paths derived from the gluing process (in the case $r = 1$, we denoted them
by $P_j$). To each path $Q_j$, we associate its connected component (also called
cluster) $S_j$, $j = 1, \ldots, l$, which is the set of initial subpaths that have been glued
to form $Q_j$. We then obtain a "path" $P'$ of length $2(s_N - (s - l))$ with origin 1 or
2 and having $m$ returns to the level 0, for some $m \le s$. Note that (as $T_1 T_2 > 0$)
$P'$ is not a path in the usual sense, since one might switch from vertex 1 to 2 at
any instant where one switches from one cluster to another. Nevertheless each
cluster (or subpath $Q_j$) starts and ends with the same vertex. We start with the
following important remark. Assume that the clusters $S_j$ and $S_{j+1}$ do not have
the same origin and that, for instance, $S_{j+1}$ starts with a 2 (the reverse case is
similar). This necessarily implies that some subpaths starting or ending with a
2 have been glued in some preceding clusters. In other words, if we denote by
$K$ the number of times one switches the origin of successive clusters, one has
that $K \le s - l$.
We first assume that $l = m$, that is, the returns of the trajectory associated to
$P'$ to the level 0 define the $l = m$ clusters. Here we show that the contribution
of paths for which $T_2 > 0$ is negligible if $\pi_2 < \pi_1$. Their contribution is of the
same order as that of paths with only 1-edges (and only 2-edges) in the case
where $\pi_1 = \pi_2$.

One of the main points in the analysis is to estimate the number of preimages
$P$ of a glued path $P'$, that is, to establish the counterpart of Lemma 2.1. The
number of ways to determine the set $T$ of the $s - m$ moments of time where some
1- or 2-edge has been erased is bounded as before. Yet the number of ways

to reorder the subpaths $P_i$ thus defined is much smaller than in the case where
$r = 1$. When $r = 1$, we used (recall the proof of Lemma 2.1) an upper bound
in terms of $s!$. When $r = 2$, there are some constraints on the way to reorder the

subpaths $P_i$: they must be reordered in such a way as to form a path. Indeed a
subpath starting with a 1-edge (resp. 2-edge) cannot follow a subpath ending
with a 2-edge (resp. 1-edge). When $r = 2$ (and $T_1 T_2 > 0$) and assuming that
the origin of $P'$ is chosen, the maximal number of ways to reorder the subpaths
$P_i$ if one does not take these constraints into account is bounded by $s^{s-m}$: for
each cluster $S_j$, it is enough to indicate the number of "slots" between the first
subpath and each of the subpaths of $S_j$. Let us now call $R$ the number of ways to
reorder the $P_i$'s in an admissible way. Then if $T_1 T_2 > 0$, one has that

$$R \le 8 s^{s-m-1}. \qquad (46)$$

To prove this, we need some notation. We call $x_1$ (resp. $x_2$) the number
of subpaths $P_i$ starting and ending with a 1-edge (resp. with a 2-edge), and
$2x_3 := s - x_1 - x_2$ denotes the number of subpaths with both a 1-edge and a 2-edge. It will be convenient to call these subpaths respectively 1-paths, 2-paths or
12-paths. We here consider the set of paths for which $T_1 \ge T_2$ and which are
obtained from an admissible configuration of the $P_i$'s. Assume that $x_2 \neq 0$. If
$x_1 \ge x_3$, consider all the configurations obtained by permuting one of the $x_1$ 1-subpaths with one of the $x_2$ 2-subpaths. Then distinct admissible configurations
lead to distinct non-admissible configurations. Similarly, if $x_3 > x_1$, we consider
all the configurations obtained by permuting one of the $x_3$ 12-subpaths with one
of the $x_2$ 2-subpaths. If now $x_2 = 0$, we consider all the configurations obtained
by permuting one of the $x_3$ 12-subpaths with one of the $x_1$ 1-subpaths. In all
these cases, the number of permutations is at least $s/8$, which leads to (46).
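
The counting behind (46) can be written out as follows. This is only a sketch: it assumes, as in the text, that $l = m$ and that the non-admissible configurations produced from distinct admissible ones do not coincide. The symbol $l_j$ for the cardinality of the cluster $S_j$ is introduced here for illustration, extending the notation $l_1$ used above.

```latex
% Unconstrained reorderings: inside the cluster S_j (of cardinality l_j),
% each of the l_j - 1 subpaths other than the first one is placed by
% choosing one of at most s slots, and l_1 + ... + l_l = s.
\[
  \prod_{j=1}^{l} s^{\,l_j - 1}
  = s^{\sum_{j=1}^{l} (l_j - 1)}
  = s^{\,s - l}
  = s^{\,s - m}
  \qquad (\text{using } l = m).
\]
% Each admissible configuration produces at least s/8 non-admissible ones,
% and all configurations are counted among the s^{s-m} reorderings, so
\[
  R \cdot \frac{s}{8} \le s^{\,s-m}
  \quad\Longrightarrow\quad
  R \le 8\, s^{\,s-m-1}.
\]
```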

From now on, we assume that there exists a real number $c > 0$ such that
$\lim_{N \to \infty} s_N/\sqrt{N} = c$ if $\pi_1 > w_c$ and $\lim_{N \to \infty} s_N/N^{2/3} = c$ if $\pi_1 < w_c$.
In the following, we focus on the estimation of the contribution from paths $P$
having 1-edges and 2-edges passed only twice. As in the case $r = 1$ (and calling
on [50]) it is not hard to see that, amongst the associated glued paths $P'$, the
typical ones have edges passed at most twice.

Thus, the contribution of such paths $P$ with $T_1$ pairs of 1-edges and $T_2$ pairs of
2-edges ($T_1 \ge T_2 > 0$) can be bounded from above by

Sn sn~si s m 



si = l s=0 m=l K=l ^ ^ k.ki 



SiN(si — 1, Si — fci)N(sAr — Si — (s — m), k — ki,m — 1) 



^7jv \ ( Sn - (s - m) - I 



N J \ s ^ m 

where the extra factor $\binom{m}{K}$ comes from the fact that we have to distribute the
$m$ starting points of clusters into those starting with 1's and 2's (and $C$ is a
positive constant whose value may vary in the following). The contribution of
paths for which $T_2 > T_1 > 0$ can be analyzed in a similar way. One simply
interchanges the roles of $x_1$ and $x_2$ in the previous reasoning.

Since $\sum_{1 \le K \le s-m} \binom{m}{K} \le 2^m$, it is clear that the paths $P$
such that $m \le 100(s - m)$ (100 is an arbitrarily large constant here) yield a
contribution which is at most of the order of
100 „ 

^-^^"E E EEw^Qe 

si=l s=0 m=lK=l ^ ' k.ki 



SiN(si — 1, Si — fci)N(s7v — Si — (s — m), k — ki,m — 1) 

27jV^ ^ " ™ /^STV - (s - m) - 1^ ^^^^s-m„fc+(s-m)-s„_sl 



N V s-m ; ' ^ ^s ^ 

T-2<s 



$$\begin{cases} o(1)\, Z_1 & \text{if } \pi_2 < \pi_1, \\ O(1)\, Z_1 & \text{if } \pi_2 = \pi_1, \end{cases} \qquad (48)$$

where Zi is given by p4)) . 
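
The elementary bound on the sum over $K$ used for the paths with $m \le 100(s - m)$ can be written out explicitly. The last step, absorbing $2^{100}$ into the generic constant $C$, is our reading of how the restriction on $m$ is used; it is an assumption rather than a statement taken from the argument above.

```latex
% Enlarging the range of K to 0, ..., m gives the full binomial sum.
\[
  \sum_{1 \le K \le s-m} \binom{m}{K}
  \;\le\; \sum_{K=0}^{m} \binom{m}{K}
  \;=\; 2^{m}.
\]
% When m <= 100 (s - m), this is a factor of the form C^{s-m}:
\[
  2^{m} \;\le\; 2^{100 (s-m)} \;=\; \bigl(2^{100}\bigr)^{s-m}.
\]
```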

It now remains to estimate the contribution of paths $P$ for which $m >
100(s - m)$. To this aim, we need to refine our preceding reasoning. Assume
that the $s - m$ moments of time of the set $T$ as well as the $K$ instants where
one switches the origin of the clusters have been selected. Assume also for ease
that the origin of the path $P'$ is 1 and denote by $m_1$ (resp. $m_2$) the number
of 1-paths (resp. 2-paths) starting a cluster. Last, set $m_3 := m - m_1 - m_2$. To
reorder the P^'s, we first reorder the 2x3 1 2-paths. There are ^^^^ ways to 
do so. Then we determine the number of 1-paths and 2-paths to be inserted 
in-between the 12-paths and reorder them. There are at most ^72''"™ ways 
to do so (the 2'*^™ is due to the possible choice of the direction of reading each 
of the 1- and 2-paths). 

Thus the contribution of paths $P$ for which $m > 100(s - m)$ can be bounded
from above by

sjv sjv— SI m r \ 

^-^^"E E E Ei-^^-"©E 

si = l s=0 ,„>ioo^_ft:=i V / kM 

siN(si - 1, Si - A:i)N(s7v - si - (s - m), fc - fci, to - 1) f \ 
T2



V s — m ) mi\m2\m^\ ^-^ V tti 

^ T2<S ^ 

Using the fact that , \ , < i^^xioi/ioo ^^^^ Ea- (^) < 8"/^, 

^ milm2'.m3l — (m/3)! s \KJ — ' 

it is clear that the contribution of paths for which $s > \sqrt{s_N}$ is negligible. The
contribution of paths for which $s \le \sqrt{s_N}$ can be analyzed as follows. If $\pi_1 > w_c$,



-r^"" << r(7ri)*" and is thus negligible. If 
$\pi_1 \le w_c$, it is not hard to see that their contribution is at most of the order of $Z_1$
(and thus negligible if $\pi_1 < w_c$).

The contribution from paths $P$ (such that $l = m$) having 1-edges and 2-edges
possibly read more than twice can be examined by refining the above analysis
thanks to arguments already used in Section 3 for the investigation of $Z_3$. We
skip the details. Thus, one can show that paths $P$ such that $l = m$ satisfy:

(a) if $\pi_1 = \pi_2 > w_c$, the typical paths $P$ have 1-edges and 2-edges seen at most
4 times and no other edge seen more than twice;

(b) if $\pi_1 > \pi_2$, the typical paths have no 2-edge;

(c) if $\pi_1 < w_c$, the typical paths have neither 1-edges nor 2-edges.

Last, the contribution from paths $P$ such that $l > m$, that is, when some occurrences of 1- or 2-edges in $P'$ arise at some positive level, can be analyzed
using the same arguments as in Section 3.2 together with the arguments above. We then
deduce that when $r = 2$, the typical paths contributing to $\mathbb{E}(\mathrm{Tr}\, V_N^{s_N})$ satisfy the
three preceding conditions (a) to (c). Combining all the preceding justifies the
universality of the expectation $\mathbb{E}(\mathrm{Tr}\, V_N^{s_N})$.